I'm a PhD student at MIT on a Fulbright Future Scholarship, advised by Sandy Pentland in the Human Dynamics group at the MIT Media Lab and Connection Science in the Institute for Data, Systems, and Society.
I work at the intersection of data, privacy, AI, and society. In particular, I'm focused on how communities and individuals can privately and securely share data to extract social-good insights, rebuild social capital, or securely leverage large language models. This takes many forms, from decentralized private data sharing to secure data pooling for communities. I'm optimistic that a decentralized, pluralistic, and composable Web3 & AI future will be great, and I'm working hard to make sure I'm not wrong.
Previously, I did research as a data scientist and applied mathematician at The University of Adelaide, where I focused on networks of natural language information flow. I also ran the Bayesian arm of the COVID-19 forecasting for the Australian Federal Government and worked as an RA on a variety of projects. Oh, and I have a few failed startups from over the years, because you've got to learn somehow.
I care a lot about governance and have a bad habit of agreeing to too many meetings. I have a non-executive role on the board of the MIT bigdata Living Lab in Adelaide and sit on the Graduate Student Council at MIT. I was involved in far too many committees and student organizations throughout my Bachelor's, where I was Valedictorian of the graduating class of Mathematics, Computer Science, and Electrical Engineering.
Before all that, I grew up in the "Peanut capital of Australia", the agricultural town of Kingaroy, Queensland, but I consider Adelaide my real home. Don't hesitate to reach out if you have something interesting to talk about.
A secure & privacy-preserving approach to community data pooling for auditable information retrieval in LLM use.
"A Decentralized Society Needs An Offline" Public talk at the MIT Media Lab's Fall Members Week.
"Building a Healthier Feed: Private Location Trace Intersection Driven Feed Recommendations" working paper.
A talk for the Adelaide Data Science Centre on the Future of Data Science.
Tobin South, Bridget Smart, Matthew Roughan, and Lewis Mitchell. Information flow estimation: A study of news on Twitter
Are we always in strife? A longitudinal study of the echo chamber effect in the Australian Twittersphere
Official government forecasting of COVID-19 cases during the pandemic to inform policy response. Uses a hierarchical Bayesian model to learn state-level priors for the effective reproduction number. The temporal posterior is estimated using mobility indices and social distancing surveys, which in turn allows for outbreak simulation. GitHub Repository Link.
Developing spatial risk mapping tools to identify COVID-19 risk using Facebook and Department of Transport mobility datasets. Developed using Plotly Dash in collaboration between UofA, UMelb and DSTG.
A package to calculate sequence-level cross entropy rates (and self entropy rates), designed to be extremely fast for large text datasets. Useful in estimating entropic information flow. Built using Python and Numba.
My thesis, titled "Non-parametric Information Flow Estimation in Social-Media News", submitted for my Master of Philosophy in Applied Mathematics and Data Science at the University of Adelaide.
Pond, Tyson, Saranzaya Magsarjav, Tobin South, Lewis Mitchell, and James P. Bagrow. "Complex contagion features without social reinforcement in a model of social information flow." Entropy 22, no. 3 (2020), doi.org/10.3390/e22030265
Tobin South, Matthew Roughan, Lewis Mitchell, "Popularity and centrality in Spotify networks: critical transitions in eigenvector centrality", Journal of Complex Networks, Volume 8, Issue 6, 1 December 2020, doi.org/10.1093/comnet/cnaa050
Roughan, Matthew, Lewis Mitchell, and Tobin South. "How the Avengers assemble: Ecological modelling of effective cast sizes for movies." PLoS one 15, no. 2 (2020): doi.org/10.1371/journal.pone.0223833
Nguyen, Andrew, Tobin South, Nigel Bean, Jonathan Tuke, and Lewis Mitchell. "Podlab at SemEval-2019 Task 3: The Importance of Being Shallow." In Proceedings of the 13th International Workshop on Semantic Evaluation, pp. 292-296. 2019.
I've had opportunities to both teach and design courses and have loved every moment. This has included roles as joint lecturer for a postgraduate computer science course, practical laboratory tutor for data science courses, and workshop tutor for professional skills courses. Here are a few pieces of student feedback from the anonymous assessment of teaching quality surveyed each semester.
Very well versed in material, very outgoing and friendly, easy to talk to and genuinely interested in helping our improvement and learning.
He teaches with enthusiasm. He is always excited to teach new concepts. He clears everyone’s doubts with patience.
Tobin took an incredibly dry and tedious subject, and turned the tutorials into something I actively looked forward to attending. He was able to explain the concepts clearly, and made participation feel comfortable.
Please let Tobin be the main Lecturer. It could save students.
In early 2019, I co-founded an analytics startup which was acquired by AI consulting group Brainframe, where I transitioned to consulting work specialising in NLP and open data. In mid 2020, I co-founded a small healthtech startup digitising public sector medical consent, which was founded after winning a medical hackathon in Queensland. I've since moved on from both of these organisations.
Choose your attitude.