Between Practicality and Morality of Datahoarding

r/DataHoarder - Data Hoarding is Okay

As I write this, my 100 EUR home server is downloading an educational YouTube channel (100+ videos) and several Southeast Asian recipe websites. Why?

Because digital content is ephemeral: "A quarter of all webpages that existed at one point between 2013 and 2023 are no longer accessible". And with those pages goes so much of our culture and knowledge. It's even worse for older content (see graph).

A line chart showing that 38% of webpages from 2013 are no longer accessible

The recipe websites I'm downloading currently make their money from ads. But my frequent visits to these websites are no guarantee that they'll be around tomorrow. Companies like Google routinely shut down personal websites they consider inactive. And the success of journalistic and media websites is fleeting; just take a look at Vice Media, Buzzfeed, and Gawker Media. At any moment, a writer's work can vanish from the internet.

If I could buy the content - these recipe books and educational websites - I think I would, if it meant actual ownership (and not being at the mercy of Amazon deleting a book I bought from my own device).

And what happens if/when these websites disappear and a friend wants a copy of my char siew pork recipes or a video about portfolio management? Perhaps my copy is the only copy that still exists? Wouldn't sharing that content with my friend be a copyright violation? In some countries, yes. But letting culture and knowledge die for the sake of copyright in such a case seems like a moral violation.

I don't know what the answers are, but in the meantime, here's a screenshot of my little server chugging along and keeping knowledge and culture a little bit safer.

(If you're interested in how, search for "wget" and "yt-dlp". I may get around to putting out a video about this. Drop me a line and let me know if you'd like that.)
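For the curious, here is a rough sketch of how those two tools might be driven from a small Python script. It is not my exact setup: the URLs, the `archive/` folders, and the helper names (`wget_mirror_cmd`, `ytdlp_channel_cmd`, `run`) are all made up for illustration, and the flags are just one reasonable combination. By default it only prints the commands (a dry run), so nothing gets downloaded until you ask for it.

```python
import shlex
import subprocess


def wget_mirror_cmd(url, dest_dir):
    """Build a wget command that mirrors one site for offline reading."""
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so pages work offline
        "--adjust-extension",  # save pages with a matching .html extension
        "--page-requisites",   # also fetch images, CSS, and scripts
        "--no-parent",         # stay below the starting directory
        "--wait=1",            # pause between requests, be polite
        f"--directory-prefix={dest_dir}",
        url,
    ]


def ytdlp_channel_cmd(url, dest_dir):
    """Build a yt-dlp command that archives a channel or playlist."""
    return [
        "yt-dlp",
        # Remember finished video IDs so re-runs only fetch new uploads.
        "--download-archive", f"{dest_dir}/downloaded.txt",
        # One folder per uploader, video ID kept in the file name.
        "--output", f"{dest_dir}/%(uploader)s/%(title)s [%(id)s].%(ext)s",
        url,
    ]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False.

    Returns the process exit code, or None for a dry run.
    """
    print(shlex.join(cmd))
    if dry_run:
        return None
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    # Placeholder URLs (assumptions): swap in the sites you want to keep.
    run(wget_mirror_cmd("https://example.com/recipes/", "archive/web"))
    run(ytdlp_channel_cmd("https://www.youtube.com/@example-channel", "archive/video"))
```

Calling `run(..., dry_run=False)` would actually start the downloads, assuming both tools are installed. Put that in a nightly cron job and the archive would keep itself up to date.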

This article was updated on