increase number of reducers

2015-12-16 Thread Awhan Patnaik
A 3-node cluster with 15 GB of RAM per node. There are two tables: L has approximately 1 million rows and U has 100 million. Both have latitude and longitude columns. I want to find the count of rows in U that are within a 10 mile radius of each row in L. I have indexed the latitude and longitude colu
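
For readers following this thread, the "within a 10 mile radius" test can be sketched with the haversine great-circle distance. This is a minimal illustration in Python, not the poster's Hive query: the function names, the (latitude, longitude) tuple layout, and the mean Earth radius of 3958.8 miles are assumptions made for the example, and the brute-force loop compares every pair, so it only shows the distance logic rather than something that would scale to 1 million by 100 million rows.

```python
import math

# Assumed mean Earth radius in miles (not from the original post).
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def count_within_radius(l_rows, u_rows, radius_miles=10.0):
    """For each (lat, lon) in l_rows, count the u_rows within radius_miles.

    Hypothetical helper: compares every pair, so cost is len(l_rows) * len(u_rows).
    """
    counts = []
    for lat, lon in l_rows:
        n = sum(
            1
            for ulat, ulon in u_rows
            if haversine_miles(lat, lon, ulat, ulon) <= radius_miles
        )
        counts.append(n)
    return counts
```

At the table sizes mentioned, a full pairwise comparison is unlikely to be practical; a common approach is to first restrict candidate pairs to a coarse latitude/longitude bounding box and apply the exact distance test only to those.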

Re: Error

2015-12-16 Thread Jörn Franke
Do you have the create table statement? And the sqoop command? > On 17 Dec 2015, at 07:13, Trainee Bingo wrote: > > Hi All, > > I have a sqoop script which brings data from oracle and dumps it to HDFS. > Then that data is exposed to hive external table. But when I do : > hive> select * from ; >

Error

2015-12-16 Thread Trainee Bingo
Hi All, I have a sqoop script which brings data from oracle and dumps it to HDFS. Then that data is exposed to hive external table. But when I do : *hive> select * from ;* *OK* *Failed with exception java.io.IOException:java.io.IOException: Can't find sync mark in the stream* *Time taken: 1.01

complex join keys cannot be recognized in Hive 0.13

2015-12-16 Thread Xiaoyong Zhu
Hi Experts, I am using Hive 0.13 and have found a potential bug. The attached "implicit join.hql" has several join keys (for example store_sales.ss_addr_sk = customer_address.ca_address_sk) that cannot be recognized by Hive. In such cases Hive won't be able to optimize and can only do a cross join first whi

Re: January Hive User Group Meeting

2015-12-16 Thread Xuefu Zhang
Yeah. I can try to set up a webex for this. However, I'd encourage folks to attend in person to get more live experience, especially for those from local Bay Area. Thanks, Xuefu On Wed, Dec 16, 2015 at 3:42 PM, Mich Talebzadeh wrote: > Thanks for heads up. > > > > Will it be possible to remote

RE: January Hive User Group Meeting

2015-12-16 Thread Mich Talebzadeh
Thanks for the heads up. Will it be possible to join this meeting remotely for the live sessions? Regards, Mich Talebzadeh Sybase ASE 15 Gold Medal Award 2008 A Winning Strategy: Running the most Critical Financial Data on ASE 15 http://login.sybase.com/files/Product_Overviews/ASE-Winning

January Hive User Group Meeting

2015-12-16 Thread Xuefu Zhang
Dear Hive users and developers, The Hive community is considering a user group meeting[1] on January 21, 2016 at the Cloudera facility in Palo Alto, CA. This will be a great opportunity for a wide range of users and developers to find out what's happening in the community and share each other's experience with Hive. Th

RE: making session setting "set spark.master=yarn-client" for Hive on Spark

2015-12-16 Thread Mich Talebzadeh
Sounds like from the following list of session settings for hive set spark.home=/usr/lib/spark-1.3.1-bin-hadoop2.6; set hive.execution.engine=spark; set spark.master=yarn-client; set spark.master=spark://50.140.197.217:7077; set spark.eventLog.enabled=true; set spark.eventLog.dir=/usr/lib/

RE: making session setting "set spark.master=yarn-client" for Hive on Spark

2015-12-16 Thread Mich Talebzadeh
Thanks. With spark.master=yarn-cluster I see much more stable connections and, better still, there is no need to start the Spark master on port 7077, etc. Cheers, Mich Talebzadeh Sybase ASE 15 Gold Medal Award 2008 A Winning Strategy: Running the most Critical Financial Data on ASE 15 http://lo

Re: making session setting "set spark.master=yarn-client" for Hive on Spark

2015-12-16 Thread Xuefu Zhang
Mich, By switching the values for spark.master, you're basically asking Hive to use your YARN cluster rather than your spark standalone cluster. Both modes are supported besides local, local-cluster, and yarn-cluster. And yarn-cluster is the recommended mode. Thanks, Xuefu On Wed, Dec 16, 2015 a

making session setting "set spark.master=yarn-client" for Hive on Spark

2015-12-16 Thread Mich Talebzadeh
Hi, My environment: Hadoop 2.6.0 Hive 1.2.1 spark-1.3.1-bin-hadoop2.6 (downloaded from prebuilt spark-1.3.1-bin-hadoop2.6.gz) The Jar file used in $HIVE_HOME/lib to link Hive to spark was -> spark-assembly-1.3.1-hadoop2.4.0.jar (built from the source downloaded as zipped file spark-1

make hive startup silent

2015-12-16 Thread Awhan Patnaik
When I launch hive from the command line it prints lots of settings information onto the screen, for example, LS_COLORS values, HADOOP_CLASSPATH values, many environment variables etc. How do I prevent this printing? Mind you, I am not talking about the printing of the map and reduce progress which

Running Hive 1.5.1 on Spark 1.3.1, getting this error from time to time

2015-12-16 Thread Mich Talebzadeh
In stderr log page for app-20151216093143-0004/0 Exception at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:224) at org.apache.hadoop.hive.ql.io.CombineHiveInput

insert data to lzo table. finally the table have no data

2015-12-16 Thread jipengz...@meilishuo.com
Hi all: I may have hit a Hive bug. I am using Hive version 0.14. When I insert data into an LZO-compressed table, the destination table has no data. The HQL is: insert into table meimiao_user_register_log select * from default.meimiao_user_register_log where dt<'2015-12-16'; and the run log is: Starting Job =

Re: Loading data from HDFS to hive and leading to many NULL value in hive table

2015-12-16 Thread zml张明磊
OK, I know. Thanks Mingle. From: Jörn Franke [mailto:jornfra...@gmail.com] Sent: 16 December 2015 15:15 To: user@hive.apache.org Subject: Re: Loading data from HDFS to hive and leading to many NULL value in hive table You forgot to tell Hive that the file is comma-separated. You may want to use the CSV