Were they on star trek!?

Screenshot of the website

I made a website that was largely just from my own curiosity as a Star Trek and general TV/Movie watcher.

It happened by me remembering if I saw a character from Better Call Saul (and Breaking Bad), Tuco Salamanca (played by Raymond Cruz) anywhere in Star Trek. Why I had the random inclination to know this, I can't say.

This guy on the left over here:

Well, turns out he had a small role in Star Trek Deep Space 9!

To me, it was interesting to see how many actors got their breaks on Star Trek. It makes sense, as Star Trek is known for its weekly rotation of one-off characters.

How it works

You can enter in whatever show, actor, or movie you're curious about, and the site will list out any intersecting actors/appearances between said media entity and the Star Trek universe at large.

The only items it's comparing against are canonical entries in the series.

The Star Trek canon includes the original series, seven spin-off television series, three animated series, and thirteen films.

The app uses TMDB as well as Memory Alpha to source data. Nothing fancy, just a periodic crawl that merges both together into a big json. Only actors are considered as potential overlaps. While the TMDB also has a crew roster, I felt that for most people that kind of granularity was too much.

Popularity Score

I used Supabase to collect user searches. The first page displays a list of popular searches. These are added to a Supabase and are only ever upserted, so after an initial entry is created, any subsequent hits increases a counter. Everything is anonymized, and uses a simple rpc

sql
BEGIN

INSERT INTO popular_searches(media_id, name, type, image)

VALUES (media_id_to_upsert, name_to_upsert, type_to_upsert, image_to_upsert)

ON CONFLICT (media_id)

DO

UPDATE SET hits = popular_searches.hits + 1;

END

"Partial" searches are never honoured, only when a user clicks into a media entity (like a tv show or movie), does the database get updated.

The list of displayed popular items is updated on build periodically, as having it be real time is a bit of an overkill at this point since there isn't much traffic on an ongoing basis.

Random Musings

It seems like sometimes the entire appearance list of an actor is not present on TMDB. For example, the entry for Jeffrey Combs in DS9 gives only Brunt or Weyoun / Officer Mulkahey, but never just Weyoun. The latter character appears in many episodes of DS9, but the API is only counting...limited/hybrid appearances. Not sure why this is the case..

As it turns out, the TMDB database itself was incomplete on certain appearances. It relies on user lead corrections and entries; so any registered member is capable of making changes. As this project goes on, I suppose it may lead to a richer TMDB database compendium of star trek entries; mirroring data found on Memory Alpha.

Also after going with the angle of, lets crawl MemoryAlpha and fill out character data (which takes like 9 hours by the way); I discovered that the site has a db dump. However it did not contain direct links to images, therefore it is largely useless since I couldn't decipher the standard format of all the stuff that comes before the fileName that might be listed in the aforementioned xml dump.

See the url here:

https://static.wikia.nocookie.net/memoryalpha/images/9/9a/Q%2C_2364.jpg/revision/latest?cb=20220307152252&path-prefix=en

compared to:

https://static.wikia.nocookie.net/memoryalpha/images/3/38/Borg-symbol.jpg/revision/latest?cb=20091016022015&path-prefix=en

While we have the fileName in the xml dump, we don't know what folder it can be in, 9/9a etc. So the strategy of crawling Memory Alpha stands for now. The only things to look out for are missing characters or non-canonical names.

Reddit

At some point I posted this to Reddit here and within 24hrs it got about ~15k impressions, and the popularity data set had about ~2k requests made to it.

One interesting outcome of pre-populating the "most searched" with my own data to start with was that rather than it reflect what the subreddit truly was interesting in, it actually just attenuated the "fake" top searches so that they became the real top searches. I guess I should have wiped the db clean before hand to ensure there was no biasing.

Another fun meta game people came up with was trying to find the most overlapped show. Currently its CSI according to this comment.

I hope the app is useful, and as time goes on, the delta between non-canonical names listed on TMDB vs Memory Alpha closes.

Weird stuff

Something anomalous occurred later during this projects run. Actor Jeffrey Combs seemed to be disproportionately searched for. Going from 98 hits to 298 over the course of 4 days. I would have expected other data to rise at the same rate, a tide that was due to more popularity (like someone posting the site to another social media platform). I told my wife about this and she said she had the tab open of said actor on Firefox for iPhone. She didn't use that tab, but I think each time she opened the browser, she was inadvertently loading up and resetting the state of the page, thereby counting as a hit. I count hits as page loads of actors and tv shows, and they were somehow still being picked up even though the tab should have been inert.

So, the 'uptick' was basically just a trend of how many times my wife opens up Firefox....oh well.