I decided to play a little with Google’s
https://notebooklm.google.com/ tool. It lets you add several sources and then ask questions about their contents.
Let me explain my motivation. We have often heard here that the things presented are incomprehensible, or even simply gibberish made to look clever. Language models do not ‘understand’ things. As we have explored before, it’s all just an algorithm that operates on numbers. In principle, we could execute this algorithm by hand with paper and pencil (even though it would take us years).
For example, when we import the sources into the above tool, they become simply a long sequence of numbers. A question we then ask is simply a few more numbers appended to that sequence. Then the algorithm begins to execute: it adds and multiplies the numbers in the sequence in different ways and patterns, weighs the results against a large statistical database (the language model itself), and as a result a new number is appended to the sequence. This process repeats until all the numbers of the response have been generated. There’s no ‘understanding’ here, no thinking, no insight. We could generate the whole answer number by number without ever suspecting that these numbers can in any way be mapped to human language.
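To make this mechanical picture concrete, here is a deliberately toy sketch in Python. It is not a real language model – the ‘statistical database’ is just a table of bigram counts over a few made-up words – but the shape of the loop is the same: a sequence of numbers goes in, the statistics pick the next number, and that number is appended.

```python
from collections import defaultdict

corpus = "the cat sat on the mat the cat sat on the hat".split()

# Assign each distinct word a number – a stand-in for real tokenization.
vocab = {w: i for i, w in enumerate(dict.fromkeys(corpus))}
inv_vocab = {i: w for w, i in vocab.items()}

# The "statistical database": how often number b follows number a in the corpus.
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(corpus, corpus[1:]):
    counts[vocab[a]][vocab[b]] += 1

def next_number(sequence):
    # Pick the number that most often followed the last number, per the statistics.
    followers = counts[sequence[-1]]
    return max(followers, key=followers.get)

# The "prompt" is a sequence of numbers; the loop appends one number at a time.
sequence = [vocab["the"]]
for _ in range(4):
    sequence.append(next_number(sequence))

print(sequence)                                   # just numbers, e.g. [0, 1, 2, 3, 0]
print(" ".join(inv_vocab[i] for i in sequence))   # only we map them back to words
```

Note that if the counts had been gathered from random word salad, the exact same loop would dutifully continue the salad – which is precisely the point of the next paragraph.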
The reason the numbers added to the string (the response) seem meaningful is that, together with the source text, they fit the complicated statistics of the model, and these statistics are derived from text that is already structured meaningfully (to us humans). If the model were built from text that was already gibberish – random sequences of words – the statistics would capture that, and the new numbers/words the algorithm appends to the string would reflect it.
So, assuming the model has been trained on meaningfully structured text, we could expect it to serve as a kind of measure of whether the text of the sources more or less satisfies these statistics.
This is what I wanted to see: whether the written text – in this particular case, the text of the Inner Space Stretching essays that I’m experimenting with – is at least statistically coherent. Not whether it makes deep sense that only the human spirit can grasp, not whether it is right or wrong, but simply whether the numerical relations between the words/tokens satisfy the statistics of well-structured human language.
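One rough way to put a number on this idea – not something NotebookLM itself exposes, just a hand-rolled analogue – is perplexity: a language model’s average ‘surprise’ at a text, which is low when the text fits the statistics the model was trained on. A minimal sketch, assuming the Hugging Face transformers library and an arbitrarily chosen small model (gpt2); the sample sentences are made up for illustration:

```python
# Requires: pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    # Turn the text into the sequence of numbers the model actually sees.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids the model returns the average negative
        # log-likelihood of each number given the numbers before it.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

print(perplexity("The essays describe a gradual stretching of inner space."))
print(perplexity("Space hat inner the stretching of sat gradual describe."))
# The scrambled version should score much higher (worse): its numerical
# pattern no longer fits the statistics of well-structured language.
```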
The answer seems to be ‘yes’. This simply confirms what we have been saying all along – that these ideas are not resisted because they fail to make sense (even if only grammatical sense!) but because of purely human factors, such as antipathy, indifference, and so on. To me, this was a kind of relief. Now I know that, if nothing else, the words are at least grammatically consistent and form a coherent numerical pattern that resonates with the statistics of the model. I repeat that this has nothing to do with whether the ideal message is right or wrong, but simply that, if nothing else, the words are not random gibberish and fit the statistics of well-structured human language.
The tool can not only respond with written text but can also generate a short audio, podcast-style summary of the sources. I must admit I was quite blown away by this feature. Not only does the content itself make sense, but the quality of the generated voices is also impressive. To be fair, if I had heard it without knowing, I wouldn’t have guessed that it was generated.
Here’s an example:
Link 1
Here are two more versions, generated from earlier versions of the text:
Link 2
Link 3
The voice quality and articulation are almost uncanny. The summarization, of course, is not perfect but, to be honest, I don’t think it would have sounded much different if it had been made by two random real podcasters who are not very familiar with the depth of the phenomenology we’re dealing with here. The summaries clearly reflect the tons of self-help and popular spiritual material that the models have been trained on, but they nevertheless manage to capture some of the essential points, which again only shows that, if nothing else, the source texts at least make mechanically fitting sense.
For those who haven’t read the essays, I wouldn’t suggest that hearing these automatically generated summaries makes reading them unnecessary. Remember that the words are not there just to make statistically plausible sense – that is, to remain as more or less fitting mental puzzle pieces in the intellectual sphere. Like an art form, they only fulfill their mission when grasped as a communication from a living soul to the inner experience of another soul. It’s not about making sense of the words as mere puzzle pieces but about discovering for ourselves the living flow that the words describe.
As said, it is something of a relief that, if nothing else, the words at least fit together mechanically. I could refer to this the next time there’s an argument claiming that this inner phenomenology consists of random noise. I hope we can then move the conversation into the region where it really belongs. It’s like saying: “Look, claiming that the words don’t even make mechanical sense is not really plausible. This can be seen from the fact that a computational algorithm, which doesn’t suffer from laziness or irrational antipathy toward the text, can match its numerical patterns quite well against the statistical database of coherent, grammatically sound human language. With this out of the way, let’s focus on the real issues – the inner forces that prevent us from approaching the meaning and its living experience, by trying to convince us that there’s no reason to even try to read the text, because we have a priori assumed that it doesn’t make even grammatical sense.”
If anyone wants to experiment on their own, here’s a short how-to.
- I’ve concatenated the essays into a single Google Doc because it’s easier to import into NotebookLM. Open it here and add a shortcut to your Drive (the triangle-with-a-plus icon above the menu).
- Open https://notebooklm.google.com/, create a new notebook, add a new source, choose Google Docs, and select the document.
- Explore the possibilities.
I emphasize once again that this is not meant to promote the miracle of language models. No matter how many questions we ask the model, it will all be the same if these things remain only floating mental images. The goal is simply to help us face the real resisting forces within ourselves, instead of externalizing them.