
When I had the Holophrase website, I used to publish all of my reports and term papers from my Master’s degree there as articles. On GitHub, I’ve OSSed everything that could find a home there and won’t be considered sharing homework by the academic staff. Last year, I wanted to start publishing the reading I did for my thesis. I gave that a go, but my thesis work was interrupted a few times over longer periods and it was difficult to keep up. I read so much though, mostly social psychology, which is one of my favourite topics. For your own good, do not ask me about it if you meet me (or do? but consequences).

At the beginning of this month I started a PhD degree, and a job as a research associate at a research institute. I am still figuring out what that means. Currently, what is mostly on my mind is how to pair what I am really excited to spend a part of my life obsessing over with what would benefit society. It is not a “change the world” kind of ambition, but more like: if I am being paid to think and discover, I might as well leave something behind for others to use.

I haven’t landed here through the usual path, but I am here now. It was never possible for me to land here through the usual path, thus these kinds of posts. I would love to make as much of my daily reality as possible transparent again.

Below are a bunch of papers and other learning materials that came up last week. I spent 80% of it on programming, so there aren’t too many. This week is kind of mixed, with less programming but more conversations coming up. As the first month comes to a close, some threads of involvement in internal and external projects are taking shape. I, of course, have a hard time kicking the tech industry habit of being productive (in a very narrow sense of the word, meaning bringing value to stakeholders) from day one. You can take a girl out of the industry, but not the industry out of the girl.

I have not used any technology to edit or fix these sentences, so enjoy the raw thoughts :).

Materials

I read the “positive alignment” paper by Laukkonen et al. (2026). I wrote about it on LinkedIn, where it tanked with just a few hundred impressions from the 5000+ people I am connected with on there. I had the sensational first line and all that. You can contribute to those impressions rising here. TL;DR: AI is posed as an inevitability in every aspect of our future lives, and instead of the current approaches of harm assessment and mitigation, the paper proposes to optimise for values and morals that will contribute to human flourishing. Or so the paper says.

I am really excited to be able to participate in FAccT for the first time after years of looking at the conference from the outside, so I am reading papers that will be presented there that I come across through other channels. Apparently, the program isn’t available until days before, a stark difference from tech conferences. I read the think-aloud study by Gautam et al. (2026) on how researchers use AI in early-stage research, and the questions about accountability, transparency and trust that come up during that process. The most interesting part of the paper for me was the discussion about the added cognitive burden of the compensatory strategies researchers adopt to account for the intransparency of sources and the epistemic uncertainty of these tools. The researchers used various strategies, such as restricting the use of AI tools to peripheral and organisational tasks, adopting heuristic and manual validation, and extending highly context-dependent trust that had to be continuously renegotiated. The authors end with some recommendations about how to make tools better and more usable for researchers, among them the argument for continuity: tools should add to existing workflows that create value instead of disrupting them. I never stopped to think until now about how disrupted existing digital workflows are at the moment, and what we lose by doing that, financially, but also in terms of productivity and skill.

Traberg et al. (2026) comment on the way AI is turning research into a scientific monoculture. They mention the pull towards a common computational paradigm, and the reduction in methodological pluralism, that is, the range of methods scientists use, and its transformation into “standardised analytic pipelines”. As a computational linguist studying and working in NLP before and during the LLM takeover, I, like most of my colleagues, understand this well. A lot of the sadness many of us felt in 2023 was not because of change, but because of how it flattened a lot of the forms of knowing and discovering. This connects to what they name “meta-conformity”, when the object of study, namely AI in this case, starts to shape how we study it. They urge us to think about what is at stake, such as abandoning topics that cannot easily fit the computational frame, the loss of what alternate approaches could uncover, and the diminished resilience of the science community as a whole to pivot and shift their methods and attention with the times.

Uncovering yet another aspect of the LLM evaluation crisis, the perturbation study by Kostić et al. (2026) looks at the brittleness of leaderboards and benchmark evaluations. Their study shows that perturbing existing benchmarks like MMLU while preserving the meaning, either by replacing words with synonyms or by applying syntactic changes to the sentence structure, changes the evaluation results. The models were more sensitive to the lexical changes than to the syntactic ones. Kostić conveniently presents the paper in a video.
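To make the idea concrete for myself, here is a minimal Python sketch of a lexical perturbation check. It is not the authors' code or setup: the tiny synonym table, the `brittle_model` stand-in for an LLM, the two toy questions and all function names are made up for illustration. It only shows the general shape of the experiment, which is to score the same items before and after a meaning-preserving word swap and compare.

```python
import re

# Hypothetical toy synonym table; a real study would need a curated lexical resource.
SYNONYMS = {"big": "large", "begin": "start", "quick": "fast"}


def lexical_perturb(text, synonyms=SYNONYMS):
    """Replace whole words with a listed synonym and leave everything else intact."""
    def swap(match):
        word = match.group(0)
        replacement = synonyms.get(word.lower())
        if replacement is None:
            return word
        # Keep the capitalisation of the first letter.
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, text)


def accuracy(model, items):
    """Fraction of (question, answer) pairs that the model answers correctly."""
    correct = sum(1 for question, answer in items if model(question) == answer)
    return correct / len(items)


def score_shift(model, items, perturb=lexical_perturb):
    """Accuracy on the original items minus accuracy on the perturbed items."""
    perturbed = [(perturb(question), answer) for question, answer in items]
    return accuracy(model, items) - accuracy(model, perturbed)


def brittle_model(question):
    """Stand-in for an LLM (an assumption): it keys on surface wording only."""
    return "A" if "big" in question.lower() else "B"


# Two toy items, invented for this sketch.
items = [
    ("Which animal is big?", "A"),
    ("Which animal is quick?", "B"),
]

if __name__ == "__main__":
    print(lexical_perturb(items[0][0]))
    print(score_shift(brittle_model, items))
```

A positive `score_shift` means the score dropped even though the questions still mean the same thing, which is the kind of brittleness the paragraph above describes. A syntactic variant would plug a different `perturb` function into the same comparison.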

I read a lot of short blog posts about the process of software creation, the vibes and shifts in how developers see themselves as individuals, and product creation and placement in the industry, but mostly about practice and craft. Have you located yourself along the grief divide? You can informally dump some of your feelings on me here :). I will maybe write about this topic in more depth another time, as this question of process is creeping into my research interests.

Lastly, I explored the works of Lucy Suchman, and I came across this conversation she recently had with Terry Winograd. I ended up reading some of the work mentioned in that conversation this week, so I might tell you about it next week. The conversation is worth your time.

References

Laukkonen, R., Krier, S., Bakalar, C., Chandaria, S., Kringelbach, M., Elwood, A., Ford, D., Rosas, F., Bohacek, M., Franklin, M., Tomašev, N., Chan, S., Rieser, V., Patel, R., Levin, M., & Rao, A. (2026). Positive Alignment: Artificial Intelligence for Human Flourishing (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2605.10310

Gautam, S., Liu, H., Choi, Y., & Lease, M. (2026). How Researchers Navigate Accountability, Transparency, and Trust When Using AI Tools in Early-Stage Research: A Think-Aloud Study (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2604.23136

Traberg, C. S., Roozenbeek, J., & Van Der Linden, S. (2026). AI is turning research into a scientific monoculture. Communications Psychology, 4(1), 37. https://doi.org/10.1038/s44271-026-00428-5

Kostić, B., Fallon, C., Risch, J., & Loeser, A. (2026). Same Meaning, Different Scores: Lexical and Syntactic Sensitivity in LLM Evaluation.
