Hive Server2 thrift java client
Hi, Has anyone explored the Thrift Java client for HiveServer2? I have a Java client in place that connects to HiveServer2 and gets table details. What I have yet to figure out is how to read the actual table content, and how to get a handle on a table's storage descriptor. ThriftCLIServiceClient does not provide any methods to work with partitions or databases. Any pointers? Appreciate your help! Many thanks, Ghousia.
Re: Difference between like %A% and %a%
Just wondering about this; please let me know if you have any suggestions as to why we are getting these results.

This query does not return any data:

Query 1: hive (test)> select full_name from states where abbreviation like '%a%';

But this query returns data successfully:

Query 2: hive (test)> select full_name from states where abbreviation like '%A%';

Result of Query 1:

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201305240156_0012, Tracking URL = http://ubuntu:50030/jobdetails.jsp?jobid=job_201305240156_0012
Kill Command = /home/satish/work/hadoop-1.0.4/libexec/../bin/hadoop job -kill job_201305240156_0012
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-05-24 03:51:04,939 Stage-1 map = 0%, reduce = 0%
2013-05-24 03:51:10,970 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:11,983 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:12,988 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:13,995 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:15,004 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:16,013 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:17,020 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 0.46 sec
MapReduce Total cumulative CPU time: 460 msec
Ended Job = job_201305240156_0012
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 0.46 sec HDFS Read: 848 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 460 msec
OK
full_name
Time taken: 19.558 seconds

Result of Query 2:

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201305240156_0011, Tracking URL = http://ubuntu:50030/jobdetails.jsp?jobid=job_201305240156_0011
Kill Command = /home/satish/work/hadoop-1.0.4/libexec/../bin/hadoop job -kill job_201305240156_0011
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-05-24 03:50:32,163 Stage-1 map = 0%, reduce = 0%
2013-05-24 03:50:38,193 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:39,196 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:40,199 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:41,206 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:42,210 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:43,221 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:44,227 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 0.47 sec
MapReduce Total cumulative CPU time: 470 msec
Ended Job = job_201305240156_0011
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 0.47 sec HDFS Read: 848 HDFS Write: 115 SUCCESS
Total MapReduce CPU Time Spent: 470 msec
OK
full_name
Alabama
Alaska
Arizona
Arkansas
California
Georgia
Iowa
Louisiana
Massachusetts
Pennsylvania
Virginia
Washington
Time taken: 20.551 seconds

Thanks
Sai
Re: Difference between like %A% and %a%
In reply to Sai Sai saigr...@yahoo.in (2013/5/24): Unlike MySQL, string comparison in Hive is case-sensitive, so '%A%' does not match the same values as '%a%'. -- Jov blog: http://amutu.com/blog
Re: Where can we see the results of Select * from states
I have created an external table called states under a database called test, and loaded the table successfully. Then I tried: Select * from states; It successfully executes the MR job and displays the results in the console, but I am wondering where to look in HDFS to see these results. I have looked under all the directories in the filesystem at the URL below but cannot see the results part file. http://localhost.localdomain:50070/dfshealth.jsp Also, if I would like to save the results of a query to a specific file, how do I do it? For example: Select * from states > myStates.txt; Is there something like this? Thanks, Sai
Re: Where to find the external table file in HDFS
I have created an external table states and loaded it from a file at /tmp/states.txt. Then, at the URL http://localhost.localdomain:50070/dfshealth.jsp, I looked to see whether the file for the states table exists, and I do not see it. Just wondering whether it is saved in HDFS or not. Also, how many days will files remain under the /tmp folder? Thanks, Sai
Re: Where can we see the results of Select * from states
You can write data into the filesystem from a query using INSERT OVERWRITE [LOCAL] DIRECTORY directory1 SELECT ... FROM ... More detail: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Writingdataintofilesystemfromqueries -- Jov blog: http://amutu.com/blog
Re: Difference between like %A% and %a%
But it should return more results for %a% than for %A%. Please let me know if I am missing something. Thanks, Sai
Re: How to look at the metadata of the tables we have created.
Is it possible to look at the metadata of the databases/tables/views we have created in Hive? Is there something like sysobjects in Hive? Thanks, Sai
Re: Difference between like %A% and %a%
I have mentioned this before, and I think this is a big miss by the Hive team. LIKE, by default in many SQL RDBMSs (such as MSSQL or MySQL), is not case-sensitive. Thus when new users move over to Hive and see a command like LIKE, they will assume it behaves the same way (as many other SQL-like features do), and false negatives may ensue. Even if it stays different by default (I am OK with this, I guess; my personal preference is that it match the defaults on other systems, but in the end I am fine with it, just grumbly :) ), give us the ability to set that behavior in hive-site.xml. That way, when an org realizes that it is different and its users are all getting false negatives, they can just update hive-site.xml and fix the problem rather than having to include it in training that may or may not work. I've added this comment to https://issues.apache.org/jira/browse/HIVE-4070#comment-13666278 for fun. :) Please? :) On Fri, May 24, 2013 at 7:53 AM, Dean Wampler deanwamp...@gmail.com wrote: Your where clause looks at the abbreviation, requiring 'A', not the state name. You got the correct answer.
Re: Difference between like %A% and %a%
Hortonworks has announced plans to make Hive more SQL compliant. I suspect bugs like this will be addressed sooner or later. It will be necessary to handle backwards compatibility, but that could be handled with a Hive property that enables one behavior or the other. -- Dean Wampler, Ph.D. @deanwampler http://polyglotprogramming.com
Re: Difference between like %A% and %a%
It is not really a bug so much as it is the way Hive is designed. https://issues.apache.org/jira/browse/HIVE-4070#comment-13666362 So there already is a 'like' and an 'rlike'; mlike is a good idea. It seems like an easier UDF (low-hanging fruit) type of issue that anyone could tackle.
how to load data from SequenceFile(with Snappy compression) into hive
Hi, I have been trying to import data from a sequence file stored in HDFS, compressed with Snappy (the original file is a massive log file). I have created the tables in the Hive metastore (MySQL), installed Snappy, and tried several approaches: 1. gave the direct path with the hdfs:// prefix; 2. tried to download the file and import it as a local file, like LOAD DATA LOCAL INPATH 'FlumeData.1362965571811' OVERWRITE INTO TABLE recordsflume; Can somebody shed some light on how to import data from a SequenceFile into Hive? Thanks in advance. Regards, Ramesh
Re: Difference between like %A% and %a%
If backwards compatibility wasn't an issue, the Hive code that implements LIKE could be changed to convert the fields and LIKE strings to lower case before comparing ;) Of course, there is overhead in doing that. On Fri, May 24, 2013 at 9:50 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Also I am thinking that rlike is based on regex and can be told to do case-insensitive matching. -- Dean Wampler, Ph.D. @deanwampler http://polyglotprogramming.com
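The two ideas raised in this thread, lower-casing both sides before comparing and using a regex match with case-insensitivity switched on, can be sketched outside Hive in plain Java. This is an illustration only and not Hive's implementation: LikeDemo, likeToRegex, like, likeLowercased and likeIgnoreCase are hypothetical names, and "CA" is just a sample abbreviation value. The last line of main uses the (?i) embedded flag, which is standard java.util.regex syntax; whether a given Hive version passes rlike patterns through unchanged is an assumption to verify.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical demo class: emulates SQL LIKE with java.util.regex.
class LikeDemo {

    // Translate a LIKE pattern to a regex: % -> .*, _ -> ., all else literal.
    static String likeToRegex(String like) {
        StringBuilder sb = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                sb.append(".*");
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    // Case-sensitive match, the behaviour reported in this thread.
    static boolean like(String value, String pattern) {
        return Pattern.compile(likeToRegex(pattern), Pattern.DOTALL)
                .matcher(value).matches();
    }

    // Lower-case both the value and the pattern before comparing.
    static boolean likeLowercased(String value, String pattern) {
        return like(value.toLowerCase(Locale.ROOT),
                pattern.toLowerCase(Locale.ROOT));
    }

    // Let the regex engine ignore case instead of rewriting the inputs.
    static boolean likeIgnoreCase(String value, String pattern) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
        return Pattern.compile(likeToRegex(pattern), flags)
                .matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("CA", "%a%"));            // false
        System.out.println(like("CA", "%A%"));            // true
        System.out.println(likeLowercased("CA", "%a%"));  // true
        System.out.println(likeIgnoreCase("CA", "%a%"));  // true
        // Embedded flag, the regex-style way to ask for case-insensitivity.
        System.out.println(Pattern.compile("(?i)a").matcher("CA").find()); // true
    }
}
```

Lower-casing allocates new strings for every row, which is the overhead mentioned above; the flag-based variant leaves the inputs untouched.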
Re: Difference between like %A% and %a%
It is not as simple a problem as you think. MySQL has the same problem; it is just that most everyone uses a default charset and comparator. http://www.bluebox.net/about/blog/2009/07/mysql_encoding/ How do you account for foreign characters like the a~, etc.? Is that ... than A and less than ...
OrcFile writing failing with multiple threads
All, I have a test application that is attempting to add rows to an OrcFile from multiple threads, however, every time I do I get exceptions with stack traces like the following:

java.lang.IndexOutOfBoundsException: Index 4 is outside of 0..5
    at org.apache.hadoop.hive.ql.io.orc.DynamicIntArray.get(DynamicIntArray.java:73)
    at org.apache.hadoop.hive.ql.io.orc.StringRedBlackTree.compareValue(StringRedBlackTree.java:55)
    at org.apache.hadoop.hive.ql.io.orc.RedBlackTree.add(RedBlackTree.java:192)
    at org.apache.hadoop.hive.ql.io.orc.RedBlackTree.add(RedBlackTree.java:199)
    at org.apache.hadoop.hive.ql.io.orc.RedBlackTree.add(RedBlackTree.java:300)
    at org.apache.hadoop.hive.ql.io.orc.StringRedBlackTree.add(StringRedBlackTree.java:45)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.write(WriterImpl.java:723)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$MapTreeWriter.write(WriterImpl.java:1093)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:996)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:1450)
    at OrcFileTester$BigRowWriter.run(OrcFileTester.java:129)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

Below is the source code for my sample app that is heavily based on the TestOrcFile test case using BigRow. Is there something I am doing wrong here, or is this a legitimate bug in the Orc writing?

Thanks in advance,
Andrew

- Java app code follows -

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.CompressionKind;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;

import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class OrcFileTester {

    private Writer writer;
    private LinkedBlockingQueue<BigRow> bigRowQueue = new LinkedBlockingQueue<BigRow>();

    public OrcFileTester(){
        try{
            Path workDir = new Path(System.getProperty("test.tmp.dir",
                "target" + File.separator + "test" + File.separator + "tmp"));
            Configuration conf;
            FileSystem fs;
            Path testFilePath;
            conf = new Configuration();
            fs = FileSystem.getLocal(conf);
            testFilePath = new Path(workDir, "TestOrcFile.OrcFileTester.orc");
            fs.delete(testFilePath, false);
            ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector
                (BigRow.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
            writer = OrcFile.createWriter(fs, testFilePath, conf, inspector, 10, CompressionKind.ZLIB, 1, 1);
            final ExecutorService bigRowWorkerPool = Executors.newFixedThreadPool(10);
            //Changing this to more than 1 causes exceptions when writing rows.
            for (int i = 0; i < 1; i++) {
                bigRowWorkerPool.submit(new BigRowWriter());
            }
            for(int i =0; i < 100; i++){
                if(0 == i % 2){
                    bigRowQueue.put(new BigRow(false, (byte) 1, (short) 1024, 65536, Long.MAX_VALUE, (float) 1.0, -15.0, bytes(0,1,2,3,4), "hi", map("hey","orc")));
                } else{
                    bigRowQueue.put(new BigRow(false, null, (short) 1024, 65536, Long.MAX_VALUE, (float) 1.0, -15.0, bytes(0,1,2,3,4), "hi", map("hey","orc")));
                }
            }
            while (!bigRowQueue.isEmpty()) {
                Thread.sleep(2000);
            }
            bigRowWorkerPool.shutdownNow();
        }catch(Exception ex){
            ex.printStackTrace();
        }
    }

    public void WriteBigRow(){
    }

    private static Map<Text, Text> map(String... items) {
        Map<Text, Text> result = new HashMap<Text, Text>();
        for(String i: items) {
            result.put(new Text(i), new Text(i));
        }
        return result;
    }

    private static BytesWritable bytes(int... items) {
        BytesWritable result = new BytesWritable();
        result.setSize(items.length);
        for(int i=0; i < items.length; ++i) {
            result.getBytes()[i] = (byte) items[i];
        }
        return
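One thing that may be worth trying, assuming the ORC Writer is not safe for concurrent addRow calls (the stack trace is consistent with that, but nothing in this message confirms it), is to let only one thread at a time into the writer. The sketch below shows that pattern in isolation: RowSink, ListSink and LockedSink are hypothetical stand-ins so the snippet runs without Hive on the classpath, and in the real app the lock would wrap writer.addRow(row).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SerializedWriterDemo {

    // Hypothetical stand-in for the ORC Writer; only addRow matters here.
    interface RowSink {
        void addRow(Object row) throws IOException;
    }

    // Deliberately unsynchronized sink that just collects rows.
    static final class ListSink implements RowSink {
        final List<Object> rows = new ArrayList<>();

        @Override
        public void addRow(Object row) {
            rows.add(row);
        }
    }

    // Wrapper that admits one thread at a time into the delegate.
    static final class LockedSink implements RowSink {
        private final RowSink delegate;
        private final Object lock = new Object();

        LockedSink(RowSink delegate) {
            this.delegate = delegate;
        }

        @Override
        public void addRow(Object row) throws IOException {
            synchronized (lock) {
                delegate.addRow(row);
            }
        }
    }

    // Writes rows from several threads and returns how many were stored.
    static int writeConcurrently(int threads, int rowsPerThread) {
        ListSink target = new ListSink();
        RowSink sink = new LockedSink(target);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < rowsPerThread; i++) {
                    sink.addRow("row-" + i);
                }
                return null; // returning a value makes this a Callable, so addRow may throw
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return target.rows.size();
    }

    public static void main(String[] args) {
        System.out.println(writeConcurrently(10, 1000)); // 10000
    }
}
```

Serializing every addRow call gives up any parallelism in the write path itself, so this only helps when the threads spend most of their time producing rows rather than writing them.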
Re: Difference between like %A% and %a%
Postgres/Vertica and their ilk have ILIKE, which is a case-insensitive version of LIKE, in addition to the case-sensitive LIKE. Having both works well. Cheers, Anthony
Apache Flume Properties File
Hi, I just installed Apache Flume 1.3.1 and am trying to run a small example to test it. Can anyone suggest how I can do this? I am going through the documentation right now. Thanks, Raj