How I would Improve the ArXiv


Firstly, I should apologise to those who have no idea what the arXiv is.  I know that a lot of my colleagues are readers of this blog, and this will definitely mean something to them.  For the uninitiated – the arXiv delivers (as it says on the main page)

Open access to 665,854 e-prints in Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance and Statistics

That probably doesn’t sound that impressive to some people, but focus on the words “Open Access”.  The arXiv occupies quite a special place in the subjects listed above – it is an extremely comprehensive collection of scientific articles in those fields, and they are completely free to access anywhere in the world.  Articles in regular journals are typically not free: academic institutions pay for subscriptions so their members can access them, but that requires members to be at their work desktop (or using a delicately configured laptop if they’re out of the office).  You can access arXiv papers from any computer (even your smartphone if you so dare).

This archive of papers reached critical mass almost a decade ago – now, pretty much every paper published in the appropriate subjects (and I would say damn near 100% of all astronomy papers) appears in pre-print form on the arXiv.  Also, because there is something of a lag between a paper being accepted by a journal and its subsequent publication (typically a couple of months), posting to the arXiv gets your idea out there very quickly.  If you’ve got hot-off-the-press scientific data and other teams might be working on it, then the arXiv allows you to beat them to the punch by months.

This is all brilliant, and I wouldn’t change it in any way.  What I would do to the arXiv is add one thing.  If you look at my arXiv post, then you’ll see in the bottom right corner it has 5 blog links.  This is the arXiv’s trackback system, and allows you to see what people have written about any particular paper (if anything).  If you’re outside the field (e.g. you’re a journalist), then this is a chance to find a layman’s description of the paper.  This is great for communicating cutting-edge science to the public.

Except the blog links are almost never written by the author.  If you’re looking for a simple description of what the paper is trying to do, then you could be going down a blind alley by checking out a blog link.  The blogger you end up at may have an undisclosed bias, or simply be ignorant.  I’m not saying you shouldn’t read the blog links, but it would be good to have a simple description from the horse’s mouth.

So here’s what I propose: the arXiv should present authors with an Authorised Blog Option.  If the authors select this option when uploading their paper, they write a short blog post, in simple terms, about the paper – what they’ve done, what they found out, why it is important.  Even 300 words would be enough.  Yes, it’s true that academics will probably not go to the extra effort of writing “blurb” for every submission, but you could provide an incentive.

I know quite a few people who go to extreme lengths to make sure their submission is at the top of the page for any given day.  To get to the top, you have to submit the paper at a very specific time.  The deadline for submissions is 4pm Eastern Time – if you submit before 4pm, your submission will appear the next day.  If you submit at 3:59pm, then you end up at the bottom of the list for the next day.  If you submit at 4:01pm, then you’ll be at the top of the list in two days’ time.  This sounds ridiculous, but people actually time their submissions very carefully to do this.
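For the programmers among you, here is a rough Python sketch of that timing rule as I have described it.  It is not how the arXiv actually implements anything: the names `CUTOFF`, `listing_date` and `daily_lists` are made up for illustration, times are assumed to already be in Eastern Time, weekends and holidays are ignored, and I have assumed a submission at exactly 4pm falls into the later batch.

```python
from datetime import date, datetime, time, timedelta

# 4pm Eastern cutoff, as described in the post (times assumed to be Eastern already)
CUTOFF = time(16, 0)


def listing_date(submitted: datetime) -> date:
    """Day a submission appears under the rule described above (weekends ignored)."""
    if submitted.time() < CUTOFF:
        # Made the deadline: appears the next day
        return submitted.date() + timedelta(days=1)
    # Missed the deadline: appears in two days' time
    return submitted.date() + timedelta(days=2)


def daily_lists(submissions):
    """Group (title, submitted) pairs by listing date, earliest submission first."""
    lists = {}
    for title, submitted in sorted(submissions, key=lambda s: s[1]):
        lists.setdefault(listing_date(submitted), []).append(title)
    return lists


example = daily_lists([
    ("last minute", datetime(2011, 3, 8, 15, 59)),
    ("early bird", datetime(2011, 3, 7, 16, 1)),
    ("just missed", datetime(2011, 3, 8, 16, 1)),
])
```

In this toy example, “early bird” (submitted at 4:01pm the day before) sits above “last minute” (submitted at 3:59pm) on the same day’s list, while “just missed” is pushed to the following day’s list.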

So how about this for an incentive? You provide an Authorised Blog, you get list privileges.  Your submission goes to the top of the list, regardless of what time you submitted.  Maybe you have two lists, one for papers with blogs and one for papers without, with the blogged list published earlier.  You could have other incentives too – blogged papers could be added to an email subscription service for interested users, recommended blogs could appear on the main page (with links to the paper) for several days after submission, etc.
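As a rough illustration of the “list privileges” idea, here is a small Python sketch.  Everything in it is hypothetical: the `Submission` class, its `has_blog` flag and the `order_listing` function are names I have invented for the example, not anything the arXiv provides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Submission:
    title: str
    submitted: datetime
    has_blog: bool = False  # hypothetical flag: the author wrote an Authorised Blog


def order_listing(batch):
    """Papers with an Authorised Blog first, then by submission time within each group."""
    # False sorts before True, so "not has_blog" puts blogged papers at the top
    return sorted(batch, key=lambda s: (not s.has_blog, s.submitted))


batch = [
    Submission("timed to the minute", datetime(2011, 3, 7, 16, 1)),
    Submission("late, but blogged", datetime(2011, 3, 8, 15, 30), has_blog=True),
    Submission("no blog, mid-morning", datetime(2011, 3, 8, 10, 0)),
]
ordered_titles = [s.title for s in order_listing(batch)]
```

Here the blogged paper jumps to the top even though it was submitted last, and the remaining papers keep their usual time ordering.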

If enough arXiv papers have corresponding blogs, then the whole thing could reach a new critical mass.  Instead of papers being misreported without quoting their primary source, journalists will not be able to ignore statements made by the authors themselves about their own research, and the public will be able to ignore dodgy reporting and get their physics fix straight from the primary source.  It would be a complete overhaul of how such science is disseminated in the media.  Also, it would allow much more cross-pollination of the scientific method.  I suspect there are techniques we use in Astronomy that Quantitative Finance would be interested to know about, and vice versa.  Multi-disciplinary research would become much easier if we had access to simple descriptions of others’ work – Authorised Blogs could be the perfect solution.

So that’s my idea…what do you think? ArXiv, are you listening?


6 thoughts on “How I would Improve the ArXiv”

  1. Whilst I’m in favour of there being non-expert summaries of papers, I suspect this system would quickly be gamed. Most authors are busy and don’t necessarily have much practice in summarizing their work in simple terms. I think they would simply copy and paste the abstract they’ve already written to get the benefits (going higher up the list) without needing to do any extra work.

    1. Yeah, I’ve cooled my jets somewhat on this one. You’re right in that this system would be gamed – you would need to have someone moderating post quality to avoid this. I still think non-expert summary sections are a useful feature to have, but I would maybe incentivise them differently (e.g. having a front page to the arXiv which aggregates these miniblogs like some Twitter “newspapers” do)
