Re: benefits of code gen

2017-02-10 Thread Koert Kuipers
yes agreed. however i believe nullSafeEval is not used for codegen?

On Fri, Feb 10, 2017 at 4:56 PM, Michael Armbrust wrote:
> Function1 is specialized, but nullSafeEval is Any => Any, so that's still
> going to box in the non-codegened execution path.
>
> On Fri, Feb 10, 2017 at 1:32 PM, Koert

Re: benefits of code gen

2017-02-10 Thread Michael Armbrust
Function1 is specialized, but nullSafeEval is Any => Any, so that's still going to box in the non-codegened execution path.

On Fri, Feb 10, 2017 at 1:32 PM, Koert Kuipers wrote:
> based on that i take it that math functions would be primary beneficiaries
> since they work on primitives.
>
> so i
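
The boxing point in the message above can be illustrated with a minimal sketch in plain Scala. It does not use Spark: `BoxingSketch`, `primitiveSqrt` and `boxedSqrt` are hypothetical names, and `boxedSqrt` only mimics the `Any => Any` shape attributed to nullSafeEval in the thread, not its actual implementation.

```scala
object BoxingSketch {
  // Function1 is @specialized for Double => Double, so calling this
  // function can pass and return raw doubles without allocating boxes.
  val primitiveSqrt: Double => Double = (x: Double) => math.sqrt(x)

  // With an Any => Any signature the argument and the result are
  // references, so every call boxes the input and the output
  // as java.lang.Double even though the work is on a primitive.
  val boxedSqrt: Any => Any =
    (input: Any) => math.sqrt(input.asInstanceOf[Double])

  def main(args: Array[String]): Unit = {
    val viaPrimitive: Double = primitiveSqrt(16.0)
    val viaBoxed: Any = boxedSqrt(16.0)

    // Both paths compute the same value; only the calling convention differs.
    assert(viaPrimitive == 4.0)
    assert(viaBoxed == 4.0)
    // The Any-typed result is a boxed java.lang.Double at runtime.
    assert(viaBoxed.isInstanceOf[java.lang.Double])
    println(s"primitive: $viaPrimitive, boxed: $viaBoxed")
  }
}
```

Under these assumptions, changing only the type of `f` to `Double => Double` would not by itself remove the boxing if the surrounding evaluation method still takes and returns `Any`.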

Re: [Newbie] spark conf

2017-02-10 Thread Sam Elamin
yup that worked. Thanks for the clarification!

On Fri, Feb 10, 2017 at 9:42 PM, Marcelo Vanzin wrote:
> If you place core-site.xml in $SPARK_HOME/conf, I'm pretty sure Spark
> will pick it up. (Sounds like you're not running YARN, which would
> require HADOOP_CONF_DIR.)
>
> Also this is more of

Re: [Newbie] spark conf

2017-02-10 Thread Marcelo Vanzin
If you place core-site.xml in $SPARK_HOME/conf, I'm pretty sure Spark will pick it up. (Sounds like you're not running YARN, which would require HADOOP_CONF_DIR.)

Also this is more of a user@ question.

On Fri, Feb 10, 2017 at 1:35 PM, Sam Elamin wrote:
> Hi All,
>
> really newbie question here

Re: [Newbie] spark conf

2017-02-10 Thread Sam Elamin
yeah I thought of that, but the file made it seem that it's for environment-specific rather than application-specific configuration. I'm more interested in the best practices: would you recommend using the default conf file for this and uploading them to where the application will be running (remote clus

Re: [Newbie] spark conf

2017-02-10 Thread Reynold Xin
You can put them in Spark's own conf/spark-defaults.conf file.

On Fri, Feb 10, 2017 at 10:35 PM, Sam Elamin wrote:
> Hi All,
>
> really newbie question here folks, i have properties like my aws access
> and secret keys in the core-site.xml in hadoop among other properties, but
> thats the only

[Newbie] spark conf

2017-02-10 Thread Sam Elamin
Hi All, really newbie question here folks. I have properties like my AWS access and secret keys in the core-site.xml in Hadoop, among other properties, but that's the only reason I have Hadoop installed, which seems a bit of an overkill. Is there an equivalent of core-site.xml for Spark so I don't h

Re: benefits of code gen

2017-02-10 Thread Koert Kuipers
based on that i take it that math functions would be primary beneficiaries since they work on primitives.

so if i take UnaryMathExpression as an example, would i not get the same benefit if i change it to this?

abstract class UnaryMathExpression(val f: Double => Double, name: String) extends Un

Re: Request for comments: Java 7 removal

2017-02-10 Thread Sean Owen
As usual I think maintenance release branches are created ad-hoc when there seems to be some demand. I personally would guess there will be at least one more 2.0.x and 2.1.x maintenance release. In that sense, yeah it's not really even the end of actively supporting a Java 7-compatible release. On

spark sql versus interactive hive versus hive

2017-02-10 Thread Saikat Kanjilal
Folks, I'm embarking on a project to build a POC around Spark SQL. I was wondering if anyone has experience comparing Spark SQL with Hive or interactive Hive, and data points on the types of queries suited to each. I am naively assuming that Spark SQL will beat Hive in all queries given

Re: benefits of code gen

2017-02-10 Thread Reynold Xin
With complex types it doesn't work as well, but for primitive types the biggest benefit of whole stage codegen is that we don't even need to put the intermediate data into rows or columns anymore. They are just variables (stored in CPU registers). On Fri, Feb 10, 2017 at 8:22 PM, Koert Kuipers wr
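
The contrast drawn in the message above, between passing intermediate data through rows and keeping it in plain variables, can be sketched in ordinary Scala. This is only an illustration under stated assumptions, not Spark's generated code: `FusionSketch`, `chained` and `fused` are hypothetical names, and the "double, filter, sum" pipeline is an invented example.

```scala
object FusionSketch {
  // Operator-at-a-time style: each step is its own iterator, and the
  // intermediate values travel between steps as boxed objects (Any).
  def chained(values: Array[Double]): Double = {
    val doubled: Iterator[Any] = values.iterator.map(v => v * 2.0)
    val kept: Iterator[Any] = doubled.filter(v => v.asInstanceOf[Double] > 4.0)
    kept.foldLeft(0.0)((acc, v) => acc + v.asInstanceOf[Double])
  }

  // Fused style: the same three steps collapsed into a single loop.
  // Intermediate values are local primitive variables, which the JIT
  // can keep in registers instead of materializing them as objects.
  def fused(values: Array[Double]): Double = {
    var sum = 0.0
    var i = 0
    while (i < values.length) {
      val doubled = values(i) * 2.0
      if (doubled > 4.0) sum += doubled
      i += 1
    }
    sum
  }

  def main(args: Array[String]): Unit = {
    val input = Array(1.0, 2.0, 3.0, 4.0)
    // Doubled values are 2, 4, 6, 8; only 6 and 8 pass the filter.
    assert(chained(input) == 14.0)
    assert(fused(input) == 14.0)
    println(s"chained = ${chained(input)}, fused = ${fused(input)}")
  }
}
```

Both functions return the same result; the difference is only in how intermediate values are carried between steps, which is the part the message says matters most for primitive types.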

benefits of code gen

2017-02-10 Thread Koert Kuipers
so i have been looking for a while now at all the catalyst expressions, and all the relatively complex codegen going on. so first off i get the benefit of codegen to turn a bunch of chained iterator transformations into a single codegen stage for spark. that makes sense to me, because it avoids a b

Re: Request for comments: Java 7 removal

2017-02-10 Thread Denis Bolshakov
Hello Sean, Thanks for asking. From my point of view it is OK to remove Java 7 support from Spark starting with the 2.2 release. But as a lot of users still use Java 7, could you please share your vision about bug fix releases for 2.0 and 2.1? About python 2.6 https://www.python.org/download/releases/2.6/ Pyt

Re: Driver hung and happened out of memory while writing to console progress bar

2017-02-10 Thread Ryan Blue
This isn't related to the progress bar, it just happened while in that section of code. Something else is taking memory in the driver, usually a broadcast table or something else that requires a lot of memory and happens on the driver. You should check your driver memory settings and the query pla

Request for comments: Java 7 removal

2017-02-10 Thread Sean Owen
As you have seen, there's a WIP PR to implement removal of Java 7 support: https://github.com/apache/spark/pull/16871 I have heard several +1s at https://issues.apache.org/jira/browse/SPARK-19493 but am asking for concerns too, now that there's a concrete change to review. If this goes in for 2.2