Re: adding jars - hive on spark cdh 5.4.3

2016-01-12 Thread Ophir Etzion
fixes that? (cdh 5.5 is the same hive version as cdh5.4. is it spark related and not hive?) On Sun, Jan 10, 2016 at 9:26 AM, sandeep vura wrote: > Upgrade to CDH 5.5 for spark. It should work > > On Sat, Jan 9, 2016 at 12:17 AM, Ophir Etzion > wrote: > >> It didn't work.

sporadic `Unable to find class` with anonymous functions in udf

2016-01-12 Thread Ophir Etzion
using cdh5.4.3 (hive1.1) via HiveServer. Does anyone have a suggestion about what to do / look for? the error: org.apache.hadoop.hive.ql.parse.SemanticException: Generate Map Join Task Error: Unable to find class: com.foursquare.hadoop.hive.udf.IsDefinedUDF$$anonfun$initialize$6 Serialization tr

Re: adding jars - hive on spark cdh 5.4.3

2016-01-08 Thread Ophir Etzion
i, Jan 8, 2016 at 12:24 PM, Edward Capriolo wrote: > You can not 'add jar' input formats and serde's. They need to be part of > your auxlib. > > On Fri, Jan 8, 2016 at 12:19 PM, Ophir Etzion > wrote: > >> I tried now. still getting >> >> 16/01

Re: adding jars - hive on spark cdh 5.4.3

2016-01-08 Thread Ophir Etzion
Thanks! In certain use cases you can, but I forgot about the aux thing; that's probably it. On Fri, Jan 8, 2016 at 12:24 PM, Edward Capriolo wrote: > You can not 'add jar' input formats and serde's. They need to be part of > your auxlib. > > On Fri, Jan 8, 2016 at 12:1

Re: adding jars - hive on spark cdh 5.4.3

2016-01-08 Thread Ophir Etzion
your jar is of huge size, > you can pre-load the jar on all executors in a common available directory > to avoid network IO. > > On Thu, Jan 7, 2016 at 4:03 PM, Ophir Etzion wrote: > >> I'm trying to add jars before running a query using hive on spark on cdh >> 5.

adding jars - hive on spark cdh 5.4.3

2016-01-07 Thread Ophir Etzion
I'm trying to add jars before running a query using Hive on Spark on CDH 5.4.3. I've tried applying the patch in https://issues.apache.org/jira/browse/HIVE-12045 (manually, as the patch was made against a different Hive version) but it still hasn't succeeded. Did anyone manage to do ADD JAR successfully with

last_modified_time and transient_lastDdlTime - what is transient_lastDdlTime for.

2016-01-06 Thread Ophir Etzion
I want to know, for each of my tables, the last time it was modified. Some of my tables don't have last_modified_time in the table parameters, but all have transient_lastDdlTime. transient_lastDdlTime seems to be the same as last_modified_time in some of the tables I randomly checked. what is the time
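For readers comparing the two parameters: both are normally stored in the table parameters as Unix epoch seconds, so they are easier to compare once converted to a readable timestamp. A minimal Python sketch, assuming the value has already been copied out of the table parameters as a string; the helper name `ddl_time_to_utc` and the sample value are hypothetical, not taken from the thread.

```python
from datetime import datetime, timezone


def ddl_time_to_utc(value: str) -> datetime:
    """Convert an epoch-seconds table parameter (given as a string) to a UTC datetime.

    Assumption: the parameter holds whole seconds since 1970-01-01 UTC.
    """
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


# Hypothetical sample value, not taken from a real table.
sample_transient_last_ddl_time = "1452038400"
print(ddl_time_to_utc(sample_transient_last_ddl_time).isoformat())
```

Converting last_modified_time the same way allows the two values to be compared per table.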

hive on spark

2015-12-18 Thread Ophir Etzion
During spark-submit when running hive on spark I get: Exception in thread "main" java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.HftpFileSystem could not be instantiated Caused by: java.lang.IllegalAccessError: tried to access method org.apac

Re: Hive on Spark - Error: Child process exited before connecting back

2015-12-17 Thread Ophir Etzion
e designated recipient only, if you are not the intended >> recipient, you should destroy it immediately. Any information in this >> message shall not be understood as given or endorsed by Peridale Technology >> Ltd, its subsidiaries or their employees, unless expressly so stated. It

Re: Hive on Spark - Error: Child process exited before connecting back

2015-12-15 Thread Ophir Etzion
not be understood as given or endorsed by Peridale Technology > Ltd, its subsidiaries or their employees, unless expressly so stated. It is > the responsibility of the recipient to ensure that this email is virus > free, therefore neither Peridale Ltd, its subsidiaries nor their employees

Hive on Spark - Error: Child process exited before connecting back

2015-12-15 Thread Ophir Etzion
Hi, when trying to run Hive on Spark on CDH 5.4.3, I get the following error when running a simple query using Spark. I've tried setting everything written here (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started) as well as what CDH recommends. any one enc

trying to figure out number of MR jobs from explain output

2015-12-11 Thread Ophir Etzion
Hi, I've been trying to figure out how to determine the number of MR jobs that will be run for a Hive query from the EXPLAIN output. I haven't found a consistent method for determining that. For example (in one of my queries, a CTAS query): STAGE DEPENDENCIES: Stage-1 is a root stage Stage-7 depends o