Archive for the ‘Big data’ Category



February 2, 2016

How long does it take for an issue to fall from “current affairs” into “history” or to be forgotten altogether?

I ask because I had an odd experience while completing my tax return on Sunday afternoon (well of course I left it to the last minute – I’m a retired tax inspector, after all, and you know what they say about the dentist’s children’s teeth).

I had checked (and written a smug blog entry about it) that I was able to log onto the HMRC system in good time this year.  But when I sat down on Sunday morning and typed “HMRC self assessment” into google I didn’t get back to the expected page with my details already saved.  Instead I found myself at a page headed “sign in and file your self assessment tax return” which had a link to “sign into your online account”… which did NOT have my login details already filled in as I had hoped.

Now I had, of course, taken the precaution of writing down my “HMRC User ID” (and my UTR) inside the front cover of my account book.  But I had not written down my password and it seemed my computer had not helpfully retained it in its memory and it was now 11am on 31st January and ouch!  And, incidentally, if you need a new password (which was my first thought) you can only get one if you agree to have an “online Government account email address” which I have so far refused to accept.  This is because I suspect that signing into a government email address will be as much a bore and a chore as signing into one’s self assessment account, and I utterly refuse to have legal notices like notices to file and reminders to file sent to an address which it is unlikely I will remember to log into.  To me, a reminder goes to, you know, the thing you actually look at like your ACTUAL email address.

But this is beside the point, which was that time was getting on and I still hadn’t managed to log into my self assessment account and it didn’t look as if I was going to be getting a new password any time soon enough to make a difference.  Aha!  I thought, I can follow one of the other links on the “sign in and file your self assessment” page which helpfully offers the option of signing in with “a GOV.UK Verify account”.

I don’t know what that is, I thought, but it sounds like something I should have.

So I went to this page and clicked on “this is my first time using Verify” and arrived… here.

Now, if you haven’t clicked on any links so far in this blog, I suggest you click on this one, because it tells you that

A certified company will verify your identity. They’ve all met security standards set by government.

A “certified company”.  Not HMRC.  Not any arm of the government.  A “certified company”.  They are:

  • Verizon
  • Experian
  • Digidentity
  • Post Office

I failed to register with the Post Office, and then I failed to register with Experian, mainly because I had already given them a remarkable number of details from my driving licence and my debit card and they then wanted my passport details as well, which I refused to give them.

I realise that 2006 is a long time ago, but do we recall the protests against the introduction of a national identity card scheme?  I seem to recall that one of the principal objections was that it would enable government to join up different databases and put together an enormous mass of data about our individual movements and activities.  There was a campaigning group, NO2ID, which still seems to be operational.

I was never quite sure which side of the argument I was on.  I used to be a tax inspector, after all, so I could see just how bloody useful being able to join up government databases would be.

But to me, if there’s one thing worse than having a government identity card scheme, it’s having a privatised one.  Great flying spaghetti monster, I’d rather have a democratically elected government tracking me than… an American mobile phone company, a credit reference agency, a private Dutch company or the bloody Post Office!

(After lunch I tried again.  I googled “HMRC login”, which took me straight to this page, where my HMRC User ID and the password were already helpfully in place.  Phew!  And, yes, I’ve done my tax return, on time, thanks.  Inner peace my eye!)

So.  What do we think about Verify accounts?


Tax gaps

October 30, 2015

Last week we had publication of the 2015 iteration of the HMRC Tax Gaps figures with accompanying methodology.  Note the plural: HMRC Tax Gaps, plural, and not a singular “tax gap”.  We’ll come back to that.

Let’s start with the methodological annexe.  In it there are five methods described.  They are:

  • Data matching
  • Top down methods
  • Management information
  • Random enquiries
  • Illustrative

Now let me start by saying I’m not a statistician nor an economist.  I’m a retired tax inspector with reasonable numeracy but that’s it: no superpowers, sorry, just a large pinch of salt which I can’t help applying to anything anyone tells me.  I have been vaguely aware for some time of the various controversies about the size of the tax gaps so these five methods interest me.  Let’s take a closer look.

“Top down methods” seem to be useful for indirect tax (what we might call the ex-Customs taxes).  It seems reasonable to use external data sources to work out how much of a given taxable product is consumed.  Working out what the VAT/excise duty etc would be on that level of consumption is just arithmetic, and then comparing that with how much is actually collected gives you the gap.  In other words, the tax gap is the difference between how much VAT you’d collect if it was properly calculated and paid over on all the (say) beer sold in the year and the amount actually collected.
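To make that arithmetic concrete, here is a minimal sketch in Python.  It is my own illustration of the idea as described above, not HMRC’s actual model: the function name, the single flat tax rate and the figures in the example are all assumptions made up for the purpose.

```python
def top_down_gap(taxable_consumption, tax_rate, tax_collected):
    """Illustrative top-down tax gap: theoretical tax minus tax collected.

    All three inputs are hypothetical. A real estimate would need
    product-level rates, reliefs and timing adjustments that are ignored here.
    """
    theoretical_tax = taxable_consumption * tax_rate
    if theoretical_tax <= 0:
        raise ValueError("theoretical tax must be positive")
    gap = theoretical_tax - tax_collected
    gap_share = gap / theoretical_tax  # gap as a fraction of theoretical tax
    return theoretical_tax, gap, gap_share


if __name__ == "__main__":
    # Made-up figures in £m: consumption (net of tax), a flat 20% rate, receipts.
    theoretical, gap, share = top_down_gap(10_000.0, 0.20, 1_800.0)
    print(f"theoretical £{theoretical:,.0f}m, gap £{gap:,.0f}m, share {share:.1%}")
```

The only design point worth noting is that the gap is reported both as an amount and as a share of the theoretical liability, since the share is easier to compare from one year to the next.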

“Random enquiries” is a phrase used in tax investigations.  While most HMRC enquiries are based either on intelligence or else on the statistical information and other anomalous data ground out of the department’s giant number crunching machines, there are a few which are based on, well, pot luck.  What better way is there to check the integrity of your target selection and the results of those targeted investigations than to check a few cases at random and see how they compare?  Personally I think there’s a strong case to be made for far more random enquiries (takes the heat out of the transaction, levels the playing field by making sure the hard cases don’t get screened out) but I think the data from random enquiries is a useful contribution to measuring the direct tax gap – if x% of the random cases have errors producing amount £y, you could extrapolate what amount of tax you were missing across the entire population, all things being equal.
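That extrapolation can be sketched in a few lines of Python.  Again, this is only my own illustration under loud assumptions: the function name and the sample figures are invented, the sample is assumed to be genuinely random, and nothing here allows for sampling error or for errors the enquiry failed to find.

```python
def extrapolate_from_random_enquiries(under_declared, population_size):
    """Scale up random-enquiry results to the whole taxpayer population.

    under_declared: tax found to be under-declared in each sampled case
                    (0 where the return was correct). Hypothetical data.
    population_size: number of taxpayers the sample was drawn from.
    """
    if not under_declared:
        raise ValueError("need at least one sampled case")
    n = len(under_declared)
    # Share of sampled cases with an error (the "x%" in the text above).
    error_rate = sum(1 for amount in under_declared if amount > 0) / n
    # Average tax missed per sampled case, including the correct ones.
    mean_per_case = sum(under_declared) / n
    # "All things being equal": assume the population looks like the sample.
    estimated_gap = mean_per_case * population_size
    return error_rate, mean_per_case, estimated_gap


if __name__ == "__main__":
    # Five made-up enquiries, two of which found under-declared tax.
    rate, mean, gap = extrapolate_from_random_enquiries([0, 0, 500, 0, 1500], 1000)
    print(f"error rate {rate:.0%}, mean £{mean:,.0f}, estimated gap £{gap:,.0f}")
```

With a sample as small as the one in the example the estimate would be wildly uncertain, which is rather the point about wanting more random enquiries.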

“Management information” is information taken from HMRC’s internal systems.  Now, this is where I throw my first pinch of salt into this.  I have worked in HMRC.  I have contributed at the grass roots level to the management information available in HMRC.  I have argued with managers over the years about the management information collected in HMRC, in particular when detailed data is required from busy people where no benefit accrues to them from its collection, such as in the old fashioned ways HMRC used to collect data about the use of time of its inspectors.  I am sure HMRC’s internal systems are operated with integrity and provide the best data they can provide.  But I also suspect they may sometimes produce the same kind of data as you get from opinion polling or question setting in Pointless.  In other words, without a great deal more information about what “management information” this refers to and how it is compiled, I’d take it with a large pinch of salt.

“Data matching” is described as “comparisons between related datasets” and I’m making that “whoosh” gesture, the one where you pass your hand over your head, to indicate that this is where it goes straight over mine.  Maybe it’ll become clearer further into the document?  Watch this space!

Finally there’s the category of “illustrative”, which is described as “where limited data is available, estimates are produced using assumptions made in collaboration with HMRC’s operational experts.”

Now, is it just me, or is that a polite way of saying that sometimes, where we don’t know, we just have to make a good guess?

Clearly calculation of tax gaps is going to be an… “interesting” topic: I plan to come back to it next week.

In the meantime, I’ll be at the launch of the Women in Tax network on Monday, “What do we mean by a fair tax system”, and I’m looking forward to it enormously!  If you’re there, come say hello.


Legislative gap? Legislative overload?

January 8, 2015

There’s a theory doing the rounds that the Coalition have run out of steam and are just coasting to the election date.  (7 May – in case you hadn’t heard!)

So where is the evidence?

You know where I’m going with this, right?  Yes, there are 2336 “publications” according to the front of the consultations page today, of which 125 are open consultations (and, check figure, 2209 show up as “closed consultations”.  What are the other two???)

It’s January 8th today, the Thursday of the first full week back at work after the New Year for most people.  Quiet time, right, for catching up and getting organised?

Filter by “open consultations published after 1/1/2015” and you’ll find nine publications, ranging from how to comment on open access restrictions at Bickerley Common (which is a place in Hampshire where people walk their dogs but where birds like Bewick’s swan need to be protected from, er, eutrophication, whatever that is) to the Competition and Markets Authority’s draft Welsh Language Scheme which is out for consultation here (and which worried me a bit till I realised it was also out for consultation here in actual Welsh!)

Is that a lot, in a week?

I don’t know.  There were ten publications between 1/1/14 and before 9/1/14, and more organisations are migrating their web presence to all the time.  There were 23 between the first and ninth of July (16 consultation outcomes and 7 new consultations) so maybe that’s about the going rate.

I don’t know.

It just seemed interesting to look at, that’s all.  As you were.

(But if you’re interested there are 9 open consultations about tax.  And this one closes at quarter to midnight tonight!)



April 24, 2014

I’m a bit cross, actually.  How did I miss this?  I opened my newspaper at my convention hotel on Friday morning and there were the headlines: “Borderline insane: Government plans to let HMRC sell taxpayers’ details to private companies”.

I actually trawled back through my blog entries from last summer, thinking the consultation must have been snuck out by stealth somehow, but no, here it is in my table of upcoming consultations, except it’s called “sharing and publishing data for public benefit” which sounds both like boring policy-wonkery and also like something innocuous and cuddly – public benefit, after all.  So, yes, I missed it entirely (it closed around the time I was deep in academia, preparing and giving my first academic paper and letting bloggery slide).  So shoot me.

The thing that baffles me, however, is how it came to be a big news story on Good Friday and Easter Saturday.  What happened?  I have trawled through as many articles as google will give me, but all I can find that’s new is the quote from David Davis about the plan being “borderline insane” and he barely mentions it on his website.  Is it publicity for the play about him (Privacy, at the Donmar) perhaps?  He certainly seems to have been amusing himself in giving interviews to coincide with the production.

However I’ve read the consultation document, the supplementary document (yes, seventy thousand businesses have already had their data handed over to Credit Reference Agencies, using the figleaf of appointing them as agents of HMRC for the purposes of a “trial”) and the responses doc.  So I’m ready, when the next round of consultation comes out.

At the same time, though, I’ve been thinking about taxpayer confidentiality and big data in other contexts – in doing some work with the Women’s Budget Group and the Ekklesia think tank, as well as following the conversation at the Guardian Public Sector blog.  Rather than make one giant tl;dr post, I think I’ll come back to it over the next week or so and unpack my thinking a bit.  Watch this space!