Re: Twitter Data analyse with HIVE
If you get output onto a single line it will be much easier for Hive to process.

On Tue, Jun 5, 2012 at 5:20 AM, Babak Bastan babak...@gmail.com wrote:

Hi experts,

I'm very new to Hive and Hadoop, and I want to create a very simple demo to analyse sample tweets like this:

T 2009-06-08 21:49:37
U http://twitter.com/evion
W I think data mining is awesome!

T 2009-06-08 21:49:37
U http://twitter.com/hyungjin
W I don’t think so. I don’t like data mining

Is it generally possible to do that? I don't know exactly where I should start. Do you know of any simple and clear reference for this job? Or could you please tell me (not in detail) what I should do?

Thank you very much for your help,
Babak
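The single-line suggestion above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: it assumes each tweet spans three lines tagged T (timestamp), U (user URL) and W (text), that W always closes a record, and that a tab is an acceptable field separator. The function name `flatten_records` is made up for this example.

```python
def flatten_records(lines):
    """Join each T/U/W group into one tab-separated line (assumed layout)."""
    records = []
    current = {}
    for raw in lines:
        parts = raw.strip().split(None, 1)
        if not parts or parts[0] not in ("T", "U", "W"):
            continue  # skip blank or unrecognised lines
        tag = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        current[tag] = value
        if tag == "W":  # W is assumed to be the last line of a record
            records.append("\t".join(current.get(k, "") for k in ("T", "U", "W")))
            current = {}
    return records


sample = [
    "T 2009-06-08 21:49:37",
    "U http://twitter.com/evion",
    "W I think data mining is awesome!",
    "",
    "T 2009-06-08 21:49:37",
    "U http://twitter.com/hyungjin",
    "W I don't think so. I don't like data mining",
]

for row in flatten_records(sample):
    print(row)
```

With one record per line and a fixed delimiter, a Hive table declared with a matching field delimiter should be able to read the file directly; tweets whose text itself contains tabs would need extra handling.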
Re: Twitter Data analyse with HIVE
OK, it makes no difference to me whether the records are on one line or not:

2009-06-08 21:49:37 - http://twitter.com/evionblablabla- I think data mining is awesome!
2009-06-08 21:49:37 - http://twitter.com/hyungjinbliblibli - I don’t think so. I don’t like data mining

How can I do that? I think that I should change my text file to an hdfs file, correct? How can I do this? Sorry, I'm very new in this field :(
Re: Twitter Data analyse with HIVE
Hi Babak,

There isn't anything called an hdfs file. Hdfs is just a file system that can store any type of file. You just need to transfer your file from lfs to hdfs, and the following command helps you with that:

hadoop fs -copyFromLocal <location of file in lfs> <destination location in hdfs>

Regards,
Bejoy KS
Re: Twitter Data analyse with HIVE
Thank you for your answer.

"location of file in lfs": does that mean the location of my *.txt file on my computer? And I have no destination address in hdfs; where can I get this location? Could you please write an example?
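As a hedged example of what a filled-in copy command might look like, the sketch below only builds and prints the command line; it does not contact a cluster. Both paths are hypothetical placeholders invented for illustration, and `copy_from_local_cmd` is a made-up helper name.

```python
import shlex


def copy_from_local_cmd(local_path, hdfs_path):
    """Build the argument list for `hadoop fs -copyFromLocal`."""
    return ["hadoop", "fs", "-copyFromLocal", local_path, hdfs_path]


# Hypothetical paths: a text file on the local disk and a target
# directory in HDFS. Substitute your own locations.
local_file = "/home/babak/tweets.txt"
hdfs_target = "/user/babak/twitter"

cmd = copy_from_local_cmd(local_file, hdfs_target)
print(shlex.join(cmd))
```

On a machine where the `hadoop` command is installed and configured, the same list could be handed to `subprocess.run(cmd, check=True)`. The first argument is the path on your own computer; the second is any path you choose inside HDFS.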
RE: Twitter Data analyse with HIVE
If you type

hadoop fs -ls /

it will show you the folders that currently exist on your hadoop cluster.

Regards,
Anurag Gulati
Re: Twitter Data analyse with HIVE
Lfs means local file system. hadoop fs -copyFromLocal will copy data from your local file system to the Hadoop distributed file system. I'm not sure what kind of cluster setup you have: are you running in local or pseudo-distributed mode?

Here is a link to get you started on Hive:
https://cwiki.apache.org/confluence/display/Hive/GettingStarted

You can specifically look for 'LOAD DATA LOCAL INPATH' for using the local file system.

And here is a link specifically regarding tweets:
http://www.cloudera.com/blog/2010/12/hadoop-world-2010-tweet-analysis/
Re: Twitter Data analyse with HIVE
Thank you Bejoy for your complete answer :)

If I run this command:

hadoop fs -ls /

I get these results:

drwxr-xr-x   - root root     4096 2011-04-26 01:06 /var
drwxrwxrwx   - root root     4096 2012-06-05 18:38 /tmp
drwxr-xr-x   - root root    12288 2012-06-05 17:44 /etc
-rw-r--r--   1 root root 12809911 2012-06-02 09:57 /initrd.img
drwxr-xr-x   - root root     4340 2012-06-05 17:42 /dev
drwxr-xr-x   - root root     4096 2012-06-02 09:57 /boot
drwxr-xr-x   - root root     4096 2011-04-26 00:50 /srv
drwxr-xr-x   - root root     4096 2012-06-01 11:45 /user
-rw-r--r--   1 root root 12832710 2012-06-02 09:56 /initrd.img.old
drwxr-xr-x   - root root     4096 2012-06-02 09:52 /lib
drwxr-xr-x   - root root     4096 2012-06-05 12:52 /media
drwxrwxrwx   - root root    12288 2012-06-02 08:13 /host
-rw-------   1 root root  4654608 2011-06-28 23:30 /vmlinuz.old
drwxr-xr-x   - root root     4096 2012-06-02 09:54 /sbin
drwxr-xr-x   - root root     4096 2012-06-01 11:36 /babak
dr-xr-xr-x   - root root        0 2012-06-05 12:22 /proc
drwxr-xr-x   - root root     4096 2012-05-31 22:03 /Downloads

What does the first column mean?

I tried to make a directory in Downloads:

hadoop fs -mkdir /Downloads/TwitterData

but without success. The system said:

mkdir: failed to create /Downloads/TwitterData

Can't I make a directory in Downloads?
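On the first-column question: that column is the Unix-style mode string. The first character gives the entry type (d for a directory, - for a regular file), and the next nine characters are read/write/execute flags for the owner, the group, and everyone else, three characters each. A small sketch of decoding it follows; `describe_mode` is a name invented here for illustration and is not part of Hadoop.

```python
def describe_mode(mode):
    """Decode a 10-character mode string such as 'drwxr-xr-x'."""
    if len(mode) != 10:
        raise ValueError("expected 10 characters, got %r" % mode)
    kinds = {"d": "directory", "-": "file"}
    names = {"r": "read", "w": "write", "x": "execute"}
    result = {"type": kinds.get(mode[0], "other")}
    # Characters 1-3 belong to the owner, 4-6 to the group, 7-9 to others.
    for who, start in (("owner", 1), ("group", 4), ("others", 7)):
        result[who] = [names[c] for c in mode[start:start + 3] if c in names]
    return result


print(describe_mode("drwxr-xr-x"))
```

Read this way, /Downloads in the listing is a directory that only its owner (root) may write to, which is one plausible reason a mkdir run as a different user would fail.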
Re: Twitter Data analyse with HIVE
Hi Babak,

It looks like your hadoop is not configured correctly. The listing suggests that it is showing lfs rather than hdfs. Have you configured 'fs.default.name' in core-site.xml to point to hdfs:// instead of file:///? You may need to revisit your hadoop setup.

Try out the book I recommended. It is kick-ass and will resolve all your queries.

Regards,
Bejoy KS
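The fs.default.name check described above can be sketched as a small script that reads a core-site.xml document and reports which file system scheme it selects. This is an assumption-laden illustration: the XML below is a made-up example, the host and port are placeholders, and the helper names `default_fs` and `uses_hdfs` are hypothetical. Newer Hadoop releases use the key fs.defaultFS instead, which can be passed as `key`.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def default_fs(core_site_xml, key="fs.default.name"):
    """Return the configured default file system URI, or None if unset."""
    root = ET.fromstring(core_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == key:
            return prop.findtext("value", "").strip()
    return None


def uses_hdfs(core_site_xml, key="fs.default.name"):
    """True only when the property is present and uses the hdfs scheme."""
    value = default_fs(core_site_xml, key)
    return value is not None and urlparse(value).scheme == "hdfs"


# Hypothetical core-site.xml content; localhost:9000 is a placeholder.
example = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>"""

print(default_fs(example), uses_hdfs(example))
```

When the property is missing, Hadoop falls back to file:///, in which case `hadoop fs -ls /` lists the local root directory, as in the output quoted in this thread.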