2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 39,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 14 sold-out performances for that many people to see it.



A review of internet research ethics through the Facebook-Cornell collaboration

Source: http://www.globalresearch.ca/wp-content/uploads/2014/07/Facebook-Emotional-Manipulation-400x300.jpg

As a part of my course on Social Visualization (CS 467), I had the opportunity to review the following three articles about ethical research on the internet, all related to the Facebook-Cornell study of emotion in the Facebook news feed:

Here is my review, based on the above articles, of the ethics of academic-industry collaboration for social research:

Summary of Articles

All three articles discuss the ethics of research without consent on social media platforms, specifically in the context of the controversial Facebook study on the contagion of emotions on Facebook. Each sheds light on the situation from a different perspective. The Guardian article discusses the whole situation in detail and talks about the neglect of IRB review by all the parties involved in the study, i.e. Facebook, Cornell University and the journal PNAS. The WordPress article describes another social scientist’s experience and the pros of doing online research without participants’ consent. The Medium article explains, from the industry’s perspective, why this kind of research continues to happen at the industry level, and argues that this collaboration with academia simply exposes the need to alter IRB policies for online research.

Key ethical points

I think the Guardian article highlights how all the involved parties have tried to avoid explaining the situation, and it questions whether the research was funded by the Army. It exposes the details in the Facebook data policy that allow the company to run such experiments; however, the involvement of academic scientists without proper IRB review is questionable. The article also discusses one of those scary situations in which the nexus between corporations and academia is seen as a way to bypass ethical research standards, which is not a good thing.

The WordPress article is by another social scientist, who uses her own previous research to argue that research which doesn’t cause any harm should be allowed. She suggests making the study non-harmful by removing the negative sentiment aspect from it. In her previous research the author entered various chat forums and, depending on the experimental design, shared their intention of doing research and allowed the system to kick them out if the chat room was unwilling to participate.

The Medium article is written from the perspective of a former data scientist and current academic researcher. He advocates IRB policies for social media that are distinct from those for real-world settings. Socio-technical systems allow us to run very large experiments with high efficiency, which is not possible in a physical experimental setting. He also details how adding a consent form and the other nitty-gritty of IRB requirements to online research systems makes them unusable and reduces participation, because people fear what might happen to their data.


In my view, collaboration between industry and academia is really useful, and required if we want to do representative research. Most research by the academic community is done on very small samples of social media data because of a lack of access. A corporate partnership, if done for a more academic cause, would help in getting more useful results which can be applied back to the advancement of social systems.

Also, any kind of research which may cause physical or mental pain should be highly regulated. This, however, gives us an opportunity to tackle the problem from a user interface perspective as well. How can we make interfaces which don’t scare people away from participating in research, and how can they still communicate the way the user’s data will be used?

We cannot control which experiments companies run without our consent, and we rarely get access to their results. Most of the experimental results are used to generate future revenue. However, with a corporate-academic partnership these results can be used not just to increase the company’s revenue but also to advance human science, and this in a way demonstrates the company’s commitment to corporate social responsibility.


To conclude, I agree with the research nexus between Facebook and Cornell; however, I feel the research should have been limited to positive and neutral messages only, so as not to cause any harm. To quote the WordPress article, “spreading sunshine” is not unethical.

Statistical Analysis of Rahul versus Arnab

So, Arnab kept asking for Rahul’s opinion on Modi, the 1984 riots and Ashok Chavan, and Rahul fought back with RTI, women empowerment and broader system-related questions from his armory. This is how one of India’s recent, probably “once in a lifetime”, face-offs between 2 social media hot favorites ended. The unstoppable force versus the immovable object. I present a data-based analysis of the whole proceeding.

It has been almost a week since Rahul Gandhi’s interview with Times Now journalist Arnab Goswami was published on YouTube. Those of you who haven’t seen the recent hot thing in Indian politics should spend some 1.5 hours of your time studying the psyche of the Vice President of our current party.

Since its publication the video has garnered more than 1.7 million views, and it has been quite a viral thing in the days following the actual interview.

Youtube Video Stats. Source: https://www.youtube.com/watch?v=xB_eWW5ttaM

This has also allowed the politically engrossed Indian masses on social media to share their sentiments about Rahul Gandhi and his interview. The comments in my friend groups have been mostly humorous. A majority of them have claimed that the interview was all about Rahul reiterating the same points over and over again. Phrases like “Women Empowerment”, “RTI” and “Rahul Gandhi” were supposedly overused by Rahul. Social media was abuzz with memes about Rahul Gandhi [Source: http://www.india.com/whatever/rahul-gandhis-interview-with-arnab-goswami-the-best-tweets-and-jokes-9234/] and there was even a website totally dedicated to generating answers as Rahul would have given them. [Source: http://engagedino.com/askrg]

Being a data scientist and a beginner in text processing, I decided to do a fun weekend project on the interview text, looking for patterns and checking whether they match the claims people are making on social media. Another reason this interview was of particular interest to me is that it brought 2 icons of Indian media together. It was like “an unstoppable force meeting an immovable object” [Source: The Dark Knight, 2008] and I am sure people saw the scales remain balanced till the end.

I looked at the data from 3 perspectives:

  • All Data
  • Rahul’s Text
  • Arnab’s Text

This was important because I wanted to do a frequency-based statistical analysis and see if the claims on social media were correct. So I decided to answer the following research question:

“How accurate are the claims on social media about Rahul versus Arnab, and what insights do they give into the personalities of the 2 individuals involved?”

With this simple question in mind, I decided to first test the word frequencies of all 3 datasets. Some of the preliminary results were not quite consistent with the claims, and reflected the social media audience’s inclination to hang on to a few catchphrases from the interview and make a whole viral campaign out of them.

Looking at all the data cumulatively, the hot topics which were prominent during the interview were: riots, system, RTI and Gujarat. This is quite understandable, as Arnab was trying to focus on issues like the Gujarat riots while Rahul was focused on system changes as a part of his broader-perspective strategy.
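For readers who want to try something similar, here is a minimal sketch of a word-frequency count in Python. It is not the ConText workflow used for this post; the stop-word list and the sample sentence are illustrative assumptions, not interview data.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "on",
              "we", "you", "about", "what"}

def word_frequencies(text, top_n=5):
    """Return the top_n most common non-stop-words in text as (word, count) pairs."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)

# Placeholder sentences, not quotes from the interview.
sample = "The system needs change. RTI changed the system. What about the riots?"
print(word_frequencies(sample, top_n=3))
```

Running the same function on the full text and on each speaker's text separately gives the three views described above.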

A more statistical result was:

All word statistics

However, what interested me more was the entities mentioned. This allowed me to focus on key individuals who were mentioned during the interview, and the results were quite interesting. Leaving out Rahul [Rahul will be discussed in detail when studying Rahul’s text independently ;)] and Congress, the other key entities were Gujarat and Narendra Modi, which is not surprising. However, the most interesting entity which surfaced was Ashok Chavan, whose name Arnab used many times to extract some answers from Rahul. 1984 and Cambridge were also mentioned quite frequently.

All Text Entities

I also constructed a network of entities which occurred together, and the results reflected similar patterns. People whose names were linked to the 1984 riots, like Sajjan Kumar, Bhagat and Jagdish Tytler, were also evident from the analysis.
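A rough sketch of how an entity co-occurrence network can be built is shown below. It assumes the entities are already known and that "occurring together" means appearing in the same utterance; both are simplifying assumptions, and the utterances are placeholders rather than interview quotes.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(utterances, entities):
    """Count how often each pair of entities appears in the same utterance."""
    edges = Counter()
    for utterance in utterances:
        lowered = utterance.lower()
        # Sort so that each pair is always stored in the same order.
        present = sorted(e for e in entities if e.lower() in lowered)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges

# Placeholder utterances (not quotes from the interview).
utterances = [
    "A question linking Modi and Gujarat.",
    "A question about 1984 and Sajjan Kumar.",
    "A follow-up on Gujarat, Modi and 1984.",
]
entities = ["Modi", "Gujarat", "1984", "Sajjan Kumar"]
edges = cooccurrence_edges(utterances, entities)
for (a, b), weight in edges.most_common():
    print(a, "--", b, weight)
```

Each printed pair is a weighted edge, so the output can be treated as an edge list for a graph tool.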

All Entity Network

To find the central theme of the interview I did topic modelling of the text and got 5 major clusters of topics which co-occurred frequently. Apart from the central theme of Rahul v/s Modi and the elections of 2014, the other important and more frequent themes were hidden mostly in Rahul’s answers regarding women’s issues, RTI, the system and the 2 riots, Gujarat and 1984. Ashok Chavan also came up frequently during the interview, regarding his being shielded in the Adarsh Scam.

Frequent and important topics during interview

Once finished with the overview analysis of the text corpus as a whole, I decided to dig deeper and look into the individual statements given by Rahul and Arnab. To me this is the more interesting dataset, as it gives answers to the research question I was pursuing.

Looking at Arnab’s dataset, it was quite evident that he continued his style of asking very detailed questions. Arnab spoke around 5071 words, compared to Rahul’s 7460. While Arnab was focused on issues like Modi, Chavan and the 1984 riots, Rahul was more focused on issues like system, people, RTI and women. However, the internet memes started to make sense when I looked at the entity results of Rahul and Arnab. While Arnab mentioned entities like Rahul, Modi, Gujarat and 1984, Rahul’s top entities included Congress, Gujarat, India and “Rahul Gandhi” [The Rock and Stone Cold Steve Austin would be amazed at the new entry to their club]. In fact Rahul used “Rahul Gandhi” 11 times during his statements, more than the number of times he used Modi [6] or even Ashok Chavan [3].

Rahul’s Word Stats
Rahul’s Entity Stats
Arnab’s Word Frequency
Arnab’s Entity Stats

Another interesting thing I found was that BJP and AAP were mentioned far less frequently by both individuals, especially when compared to the number of times Modi and Congress were mentioned.

While Arnab’s questions revolved around topics related to Modi, Congress, Chavan and riots, Rahul’s answers were mostly about RTI, the system and youngsters in elections, with the central topic revolving around women’s issues. The central topics were not the most frequent ones but the ones which were most uniformly distributed across the whole conversation.

Arnab’s Entity Network
Rahul’s Entity Network

In his statements Rahul tried to connect the Congress party to issues related to RTI and India, along with focusing on its performance in various states. Rahul frequently tried to draw differences between the Gujarat riots and the 1984 riots. This was quite different from the entities Arnab tried to link. Arnab’s focus revolved around Modi and his “Shehzada” comments about Rahul, and Rahul’s performance in UP. Arnab also tried to pit Rahul against the BJP PM candidate Modi by bringing up Modi’s candidature for PM of India quite regularly during the interview.

Arnab’s Topics
Rahul’s Topics


I am thankful to the Times of India website for making the full text of the interview publicly available online. [Source: http://timesofindia.indiatimes.com/india/Rahul-Gandhis-first-interview-Full-text/articleshow/29455665.cms] This made the text analysis a far easier project for me. (After all, I was not interested in spending another hour and a half trying to transcribe the whole audio.)

After getting the data I had to clean it to get it into an analyzable state. I decided to split it into 3 separate datasets:

  • Full text of Interview
  • Only Rahul’s Text
  • Only Arnab’s Text
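The split into these three datasets can be sketched roughly as follows. The sketch assumes each speaker turn starts with a "Name: text" label, which is a guess about the transcript layout; the labels and the sample text are placeholders, not the real transcript.

```python
def split_by_speaker(transcript, speakers=("Arnab", "Rahul")):
    """Group transcript lines by speaker, assuming turns start with 'Name: text'."""
    texts = {name: [] for name in speakers}
    current = None
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        label, sep, rest = line.partition(":")
        if sep and label.strip() in texts:
            # A new turn begins; later unlabeled lines belong to this speaker.
            current = label.strip()
            line = rest.strip()
        if current is not None and line:
            texts[current].append(line)
    return {name: " ".join(parts) for name, parts in texts.items()}

# Placeholder transcript (not the real interview text).
transcript = """Arnab: First placeholder question?
Rahul: First placeholder answer.
It continues on a second line.
Arnab: Second placeholder question?"""

parts = split_by_speaker(transcript)
word_counts = {name: len(text.split()) for name, text in parts.items()}
print(word_counts)
```

The full-text dataset is simply the whole transcript, while `parts` holds the two per-speaker datasets.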

I used a tool called ConText for data analysis (word stats, entity stats, network generation and topic visualizations), along with some Python scripts for parsing the data. I created the visualizations in Gephi, using centrality measures for node sizes [degree] and label sizes [betweenness], and modularity classes for node coloring.

Interactive Charts of the images presented above along with full analytics data can be found at:



After doing this basic analysis I realized that topics which were not the most frequent, but were spread uniformly through the discussion, became more popular on social media. Rahul’s usage of women empowerment, RTI and “Rahul Gandhi” was caught by social media enthusiasts and made viral. However, this also led to many other important topics and issues being buried beneath this viral sharing. Key individuals like Ashok Chavan and Virbhadra, and some of the scams which were mentioned, were not picked up by social media audiences.

Another important observation was about Rahul’s evasion of questions: how rarely he answered to the point or named the individuals he was supposed to give a statement on. Even though it is a perfectly safe and good strategy to answer in a positive tone, mentioning the issues one envisions, I would say that in a personal interview being a bit more specific and elaborate on the questions at hand is more important. As the statistics reflected, the platform appeared to me more as a means for him to talk about what he is planning for the future and has done in the past than about the key issues in the current political scenario. Overall, the claims on social media were quite accurate.

Finally, I still feel there are a lot more things which could make this analysis more useful and interesting. Some ideas I have but can’t implement because of lack of time [PhD studies ;)] are:

  • Word correlation for each question and its corresponding answer
  • Language model for Rahul Gandhi’s answers and Arnab’s questions [the latter can be done more easily because of the abundant dataset available]
  • Sentiment tracking for each entity and the way in which the answers were given.
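As a starting point for the first idea, here is a small sketch that scores how much vocabulary a question and its answer share. It uses Jaccard overlap as a simple stand-in for word correlation, which is my substitution and not a full correlation measure, and the question-answer pairs are placeholders.

```python
import re

def tokens(text):
    """Lower-cased set of word tokens in text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def jaccard_overlap(question, answer):
    """Shared words divided by all distinct words across question and answer."""
    q, a = tokens(question), tokens(answer)
    union = q | a
    return len(q & a) / len(union) if union else 0.0

# Placeholder question-answer pairs (not quotes from the interview).
pairs = [
    ("placeholder question about riots", "placeholder answer about system"),
    ("placeholder question about riots", "placeholder answer about riots"),
]
for question, answer in pairs:
    print(round(jaccard_overlap(question, answer), 2))
```

A low score for a pair would hint that the answer moved away from the vocabulary of the question, although stop words should be removed first for the score to mean much.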

Fun Bites

Just today I also happened to see this quite dramatic reconstruction of the whole interview by Cyrus Broacha. I think a language model for both Rahul and Arnab would have greatly improved the video.


This article is my personal analysis and opinion on the issue at hand. I have cited sources from which I have taken the data and the tools I have used. If anyone plans to reproduce this article or my analysis on their site, please give a link back.

If you agree or disagree, have thoughts to add to my analysis, or want to answer broader questions related to light-bulb-changing labor forces and chickens crossing the street, please feel free to use the comment section.

Also, humor and analysis were some of the key elements I thought of while writing this piece and I am pretty sure I ended up doing the latter relatively more than the former.

2013 – An Unexpected Journey

Welcome to 2014

Live Life Like A Jive – Yes, those were the words which defined and shaped my year 2013. This was a year which included a lot of learning experiences for me. As I sit here to reminisce about the 365 days which just flew past me like a gust of fresh air, I remember the life they breathed into me as they passed by. I want to talk about a few key things which 2013 added to my life. It is going to be long, but compressing such an eventful year into a byte would be the wrong way for me to celebrate it.


Time for me was never about the number of hours or days spent on something or the other. It was always about moments: moments which I can cherish, moments which I can learn from, moments which shape me and moments which challenge me. 2013 was a lot about creating those moments, a majority of them with people I learned to like and cherish in my life. Some of the key moments helped me imbibe the feeling of happiness, attachment, exuberance, humility and creativity.

Some moments were spent alone exploring my own self. I did this by indulging in my newfound love for creative writing, especially poetry. I penned a few poems in Hindi and English, continuing my efforts from last year. A few I recited to some close friends, others I just kept in my diary. Some I never even bothered to write down; instead I made them into my own lullaby.

Some moments were of learning some of the essentials of what the world has to offer me; this again was a continuation of what 2012 had springboarded me into. I continued learning on Coursera and finished almost 1 academic year there with 6 courses completed with certificates and 1 completed but uncertified. I realized that the certificates were just a way to keep me focused on taking the courses seriously. They were not the end goal but an important force in making me learn what I was giving time to. I have been very satisfied with my indulgence in online learning platforms and want to continue it for the rest of my life.

A few moments were of exploring the world around me, not just the natural one but also the man-made one. I would attribute this to some amazing journeys I committed myself to and was fortunate to be made a part of. The adventurous and demanding trek in the Mullayanagiri ranges was an experience to remember. There I lived what I had sometimes wanted in my dreams: to wander like Frodo in the mountains and reach the end at the end of the day. Finishing that 15 km trek was a personal achievement for me, and it made me realize that life has so much more in store for me if only I continue taking one step at a time in the forward direction. I was fortunate to top it up with some really amazing trips with the new fellowship of friends I found in Bangalore. I will always remember that 1 year in Bangalore as a turning point of my life so far.

Some moments were of composing my jives with my old friends, eating dinner and watching F.R.I.E.N.D.S. with my old college buddies, going on bike rides on rainy days and all the random late-night restaurant trips. Then there were my newfound roadside aloo paratha dhabas and some road trips by car with no plans at all.

The moments of performing our choreographed sequence in front of a full house, and of working with one of the best teams I could have had in my first workplace out of college.

The joyous moment of finally being able to pursue my boyhood dream of doing research, along with the sad realization of leaving behind all that I had learned and earned in a year to pursue bigger goals in my life. The realization that everything that has a beginning has an end. But I was happy that this beautiful journey of my life ended on a high note.

The moments spent in theaters watching some of the finest Bollywood and Hollywood movies of the year with people I love discussing them with.

The moment of watching my first cricket match in the stadium and watching one of my inspirations Adam Gilchrist and my crush Preity Zinta at their finest.

Then there was the time with my family before I set forth to the new world: the best, and probably the longest I will get for years to come.

The uncountable moments of photoboothing in various UIUC events, and countless magic moments with the new people and reuniting with some close old people in my life.

On an academic level, the moments ranged from seeing our first research output published to demonstrating our tool to a whole breed of social change agents. The slight sadness at missing the perfect 4/4 because of an A-, but at the same time realizing what research is all about. The constant efforts towards writing my first paper, which are still underway.

The proud feeling of once again starting on an idea, building it while working with a great team, and successfully getting accepted for our efforts. More on this one after March next year 😉

Being awed by the beauty of life, like a couple kissing in their car parked at the topmost turning slope of Lombard Street, SFO, the journey atop the Blackwell forest reserve, the glad you came trip and the dangerous WOLF tour. The first concert experience and the countless free stuff 😉

The many delicious meals I learned to cook and serve, and the innumerable dine-ins and eat-outs with my new friends. The late nighters and the chai latte, and Maggi after that. The 1 am effort to cook pudding and gifting it to my Professor =)

The long walks and bus rides, the sleep-ins and nightouts.

The feeling of being hugged by a thousand kids, dressed and dancing around like Martha – the speaking dog, while sweating like a pig inside the mask.

The snow angels and the youtube guided snowman. The eyebrow raising at the Halloween Frat party and my DIY costume.

So much to do and yet so little time in hand. 2013 was like an alfresco ride in an Amazon forest.

A lot of moments with myself but many of those because of the people who I am glad to have shared those moments with.


There were a lot of people who made this year so fantastic for me, right from my family and others encouraging me as I decided my future, to my team, my manager, my advisors and my friends shaping every single day of my year. Some people with whom new bonds were forged and some with whom old ties were strengthened. The people who were part of my journeys, my adventures and my wisdom. The people who were my guests and the ones who were my hosts. The new cohorts and the closer teammates. The old friends I left back in India and the new ones I made in the US. The so many birthday boys and girls I had a chance to celebrate with.

And yes, my own self, who absorbed all this and carried on with a smile =)


All the moments and all the people were part of a few major events in my life.

– My last few months at Citrix, which involved a long session of lunches, treats, trips, birthday celebrations and farewells.
– The time spent at my home in Bangalore and with friends, which accounts for a majority of the moments I remember.
– The long academic and social six-month period at UIUC and in the USA in general.

What Next –>

The year is gone and the story is written. As the clock ticks into the new year, a new day is passing by. I have new moments to create, new people to meet, new places to go, new things to learn and a lot more things to DO, but most importantly I want to continue with my motto “Live Life Like A Jive” and, as Frost said, “miles to go before I sleep …”.

Happy New Year for another unexpected Journey =)

PS: I am actually going to sleep right now as it is too late and I am feeling sleepy =P

Jeevan ki vo baaten – Those Things of Life

Do you know
what in this world
hurts the most?

Those unspoken words
that a person keeps buried in the heart.
Had those words been spoken,
perhaps this path would have turned pleasant.
But we did not say them,
who knows why.

Perhaps it was pride,
or perhaps it was fear.
Of losing something,
or of bowing down ourselves.

We have kept those words buried,
the ones we had treasured for someone else.
Some for that beloved,
some for that friend,
some were for our darling child,
and some were for the family.

Why not, today,
say those things aloud?
And into those relationships of ours
stir a fresh sweetness once again.
That forgiveness, that confession,
that declaration of friendship,
that tenderness, that love.

Why not, today, pour out everything,
yes, everything.

And fill this heart
with some joyful moments.
Yes, live this life
in these new hopes.