Friday, June 30, 2017

Count number of pages for each PDF in a folder

This is just a note about a script which may be useful to you. It uses pdftk to find the number of pages in each PDF under a folder, prints the count for each file, and finally prints the total.

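import os
import subprocess
import sys

# Collect every PDF (any letter case in the extension) under the folder given as argv[1].
matches = []
for root, dirnames, filenames in os.walk(sys.argv[1]):
    for filename in filenames:
        if filename.lower().endswith(".pdf"):
            matches.append(os.path.join(root, filename))

count = 0
for mat in matches:
    try:
        # An argument list (no shell) keeps paths with spaces or quotes intact.
        out = subprocess.check_output(["pdftk", mat, "dump_data"])
    except (OSError, subprocess.CalledProcessError):
        continue  # pdftk is missing or could not read this file
    for line in out.decode("utf-8", "replace").splitlines():
        if line.startswith("NumberOfPages"):
            try:
                pages = int(line.split(":")[1].strip())
            except (IndexError, ValueError):
                break  # unexpected format; skip this file
            print(mat + "," + str(pages))
            count += pages
            break

print(count)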
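
If you want to check the parsing step without having pdftk installed, it can be pulled out into a small function. This is only a sketch: the name pages_from_dump and the sample text are my own, and the sample simply assumes the "Key: value" layout that the script above relies on.

```python
def pages_from_dump(dump_text):
    """Return the page count found in pdftk dump_data style text, or None."""
    for line in dump_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "NumberOfPages":
            try:
                return int(value.strip())
            except ValueError:
                return None  # the value was not a number
    return None  # no NumberOfPages line at all


# Hypothetical sample text, shaped like the "Key: value" lines the script parses.
sample = "InfoBegin\nInfoKey: Title\nInfoValue: Demo\nNumberOfPages: 12\n"
print(pages_from_dump(sample))
```

Returning None rather than raising lets the caller skip a file it cannot read, which is what the script does.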


Have a great weekend ! :)

Tuesday, June 06, 2017

On "The Computer's Common Sense"

Background
On the surface of it, this is a follow-up to the blog post "The Computer's Common Sense" [read here: https://rulasense.wordpress.com/2017/05/] by my friend AKD (https://twitter.com/alok_damle), who is passionate about building a new kind of intelligent system. It is also about my understanding of the machine learning tools that I have used in my work at VLife (which is now Novalead Pharma). These are the thoughts of someone with a beginner to intermediate ML background, so this is more of a learning-via-conversation exercise for me, and more philosophically skewed than technically deep.

Artificial Intelligence vs Human Intelligence (commonly called common sense)
AKD starts off his blog with a title that makes you think a bit. It seems to equate human intelligence [W1] with common sense [W2]. To me, however, common sense (however uncommon it is) is one part of human intelligence; it is not the only form of intelligence that humans have. Further, common sense, as the name suggests, is not something specific to an individual, but has evolved over time from a group of individuals, representing common knowledge. To put it in other words, it is "ensemble intelligence" rather than something that represents an individual human. Thus, I feel that human intelligence is a combination of many factors, only one of which is common sense. The decisions that humans take are a cumulative effect of these various factors.

RULA – Read Understand Learn Apply
If we get past that oversight, some things begin to make sense to me. The example of the screwdriver (https://rulasense.wordpress.com/2017/05/17/artificial-intelligence-vs-common-sense/) kind of makes sense for the current state of the art in AI. It is quite likely that no AI will suggest using your fingernails instead of a screwdriver! *. But the reason for this probably has to do with other environmental factors that the human is in. The human brain, more often than not, tries to correlate the present situation with past situations it has encountered (when in isolation), or it tries to correlate with what others have discovered when in a similar situation (the common sense part). In isolation, a human brain probably works by a "read (or observe) - understand - learn and apply" cycle, but that may not always be the case. The second term, "understand", is kind of a misnomer here, because one can shorten this to "read (or observe) - learn and apply", with "understanding" coming at a later stage, probably a far later stage. A lot of what we humans do probably translates to "read (or observe) - learn - apply". For instance, take any kid: he observes his parents, tries to learn from them, and then does similar things. He doesn't understand what he does till he grows up. Thus I feel "understanding" comes after a series of reinforcement learning and application of what was observed. Evidently a lot of AI at the moment is focused on the "read (or observe) - learn - apply" cycle and will probably never come to the point of "understanding". Deep learning, however, may be the approach that actually brings understanding to this process [W3].

Machine Learning vs Human Learning
That brings me to the next part of the blog, which is kind of generically titled. I think the core theme of this section is to bring home the point that most of the AI today is basically data driven. Human learning, however, can happen at a much superior pace and doesn't need as much data. This is quite true. But I think this is possible because the human brain does not work alone: our brains are connected a lot with other intelligent beings, and this collective brain power, which is to a large extent what "common sense" encompasses, influences our individual learning capabilities. The "collective brain power" is not necessarily that of humans; it could be from any other form of intelligent behaviour, such as other animals or even insects. The human brain is capable of capturing and basing its learning on information acquired by other forms of intelligence. A counterpoint to the kids example above is how often we find that the little ones think differently from what is previously conceived. That, I feel, is because the kid's brain is kind of "disconnected" from the "collective brain power", which prompts it to potentially discover new ways to solve a problem, where an adult's brain just defaults to the "common sense" part.

AI at the moment is limited to what humans feed it with. It doesn't have unrestricted access to the environment outside, as we humans have. Whether that is a shortcoming of current AI, or whether AI as it is implemented today needs a fundamental rethink, remains to be seen. AKD thinks that there is an alternate way that is not yet explored. I am waiting to see what that is.


NOTES:
* I am not sure how IBM Watson [R1] will respond, because Watson is a totally different take and at the cutting edge of AI research today; that it could beat humans in the game of Jeopardy! is nothing short of amazing.

References:
R1) IBM Watson: https://www.ibm.com/watson/
R2) L. Deng, G. Tur, X. He, and D. Hakkani-Tur. "Use of Kernel Deep Convex Networks and End-To-End Learning for Spoken Language Understanding," Proc. IEEE Workshop on Spoken Language Technologies, 2012

Wikipedia:
W1) Human Intelligence https://en.wikipedia.org/wiki/Human_intelligence
W2) Common Sense https://en.wikipedia.org/wiki/Common_sense
W3) IBM Watson https://en.wikipedia.org/wiki/Watson_(computer)

Friday, June 02, 2017

Serval Project: Carrier independent network

Almost 5 years back, while putting down my idea of building a mobile experience for myself, I had suggested that a carrier independent network is what I want: something that will not only distribute the need to create infrastructure but also free us from lousy carrier plans and create a world where communication is free between humans wherever they are. [Ref: http://tovganesh.blogspot.in/2012/01/kosh-building-mobile-user-experience.html]. Obviously carrier based / satellite systems are necessary for emergency situations, but our reliance on them could definitely be minimised.

So when I saw the Serval Project (http://www.servalproject.org/), I was pleasantly surprised that they have exactly the same goal. Moreover, instead of building a whole new OS as I earlier proposed, they are going for the more practical solution of putting it in an Android app. This is a big shout out to you guys developing the Serval Project. It is like a moment when you feel that you are not alone in the thoughts you have about how to make things different in this world. I have just installed the app on my secondary Android device and tested it with a friend. Though the interface is quite primitive at this stage, and the call quality is not up to the mark, it works. It is experimental and yes, it will improve.

After digging a bit into the history of the Serval Project (http://developer.servalproject.org/dokuwiki/doku.php?id=content:about), I discovered that it was proposed almost 2 years before I had written the above mentioned article, and that an early system was used for emergency response during the Haiti earthquake.

The Serval Project is also open source (https://github.com/servalproject), and in the coming days I plan to explore this project in more depth to see if I can contribute in some way.

Meanwhile, anyone should be able to install the app from Google Play (https://play.google.com/store/apps/details?id=org.servalproject) and be part of the experiment and the quest to build a carrier independent backup network.