Re: how do i force unit test to do whole stage codegen
Thanks Koert for the kind words. That part, however, is easy to fix, and I was surprised to see the old style referenced (!)

Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Wed, Apr 5, 2017 at 6:14 PM, Koert Kuipers <ko...@tresata.com> wrote:
> it's pretty much impossible to be fully up to date with spark given how fast
> it moves!
>
> the book is a very helpful reference
Re: how do i force unit test to do whole stage codegen
it's pretty much impossible to be fully up to date with spark given how fast it moves!

the book is a very helpful reference

On Wed, Apr 5, 2017 at 11:15 AM, Jacek Laskowski <ja...@japila.pl> wrote:
> Hi,
>
> I'm very sorry for not being up to date with the current style (and
> "promoting" the old style) and am going to review that part soon. I'm very
> close to touching it again since I'm working with the Optimizer these days.
>
> Jacek
Re: how do i force unit test to do whole stage codegen
Hi,

I'm very sorry for not being up to date with the current style (and "promoting" the old style) and am going to review that part soon. I'm very close to touching it again since I'm working with the Optimizer these days.

Jacek

On 5 Apr 2017 6:08 a.m., "Kazuaki Ishizaki" <ishiz...@jp.ibm.com> wrote:
> Hi,
> The page in the URL explains the old style of physical plan output.
> The current style adds "*" as a prefix of each operation that
> whole-stage codegen can be applied to.
>
> So, in your test case, whole-stage codegen has already been enabled!!
>
> FYI. I think that it is a good topic for d...@spark.apache.org.
>
> Kazuaki Ishizaki
Re: how do i force unit test to do whole stage codegen
got it. that's good to know. thanks!

On Wed, Apr 5, 2017 at 12:07 AM, Kazuaki Ishizaki <ishiz...@jp.ibm.com> wrote:
> Hi,
> The page in the URL explains the old style of physical plan output.
> The current style adds "*" as a prefix of each operation that
> whole-stage codegen can be applied to.
>
> So, in your test case, whole-stage codegen has already been enabled!!
>
> FYI. I think that it is a good topic for d...@spark.apache.org.
>
> Kazuaki Ishizaki
Re: how do i force unit test to do whole stage codegen
Hi,

The page in the URL explains the old style of physical plan output. The current style adds "*" as a prefix of each operation that whole-stage codegen can be applied to.

So, in your test case, whole-stage codegen has already been enabled!!

FYI. I think that it is a good topic for d...@spark.apache.org.

Kazuaki Ishizaki

From: Koert Kuipers <ko...@tresata.com>
To: "user@spark.apache.org" <user@spark.apache.org>
Date: 2017/04/05 05:12
Subject: how do i force unit test to do whole stage codegen

> i wrote my own expression with eval and doGenCode, but doGenCode never
> gets called in tests.
> [...]
> so i am again missing the WholeStageCodegen. any idea why?
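To make the "*" convention concrete, here is a small sketch in plain Scala (no Spark dependency) that lists the operators marked with "*" in the new-style explain text quoted in this thread. The object name `StarredOperators` and its helper are made up for illustration, and the parsing assumes the Spark 2.x text format shown here; later Spark versions print the marker differently, so treat this as a reading aid rather than a robust parser.

```scala
// Hypothetical helper (not part of Spark): reads explain() text in the
// Spark 2.x style shown in this thread and returns the starred operators.
object StarredOperators {
  // Drop the tree-drawing characters that precede an operator name.
  private def stripTreePrefix(line: String): String =
    line.dropWhile(c => c == ' ' || c == ':' || c == '+' || c == '-')

  // Names of the operators that carry the "*" prefix, in plan order.
  def starred(explainOutput: String): Seq[String] =
    explainOutput.split("\n").toSeq
      .map(stripTreePrefix)
      .filter(_.startsWith("*"))
      .map(_.drop(1).takeWhile(c => c != ' ' && c != '('))

  def main(args: Array[String]): Unit = {
    val plan =
      """== Physical Plan ==
        |*Project [id#12L AS asId#15L]
        |+- *Filter (id#12L = 4)
        |   +- *Range (0, 10, step=1, splits=Some(4))""".stripMargin
    // prints "Project, Filter, Range"
    println(starred(plan).mkString(", "))
  }
}
```

All three operators in the test plan carry the star, which is why the plan counts as fully covered by whole-stage codegen even though no WholeStageCodegen line is printed.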
how do i force unit test to do whole stage codegen
i wrote my own expression with eval and doGenCode, but doGenCode never gets called in tests.

also as a test i ran this in a unit test:

    spark.range(10).select('id as 'asId).where('id === 4).explain

according to
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-sql-whole-stage-codegen.html
this is supposed to show:

    == Physical Plan ==
    WholeStageCodegen
    :  +- Project [id#0L AS asId#3L]
    :     +- Filter (id#0L = 4)
    :        +- Range 0, 1, 8, 10, [id#0L]

but it doesn't. instead it shows:

    == Physical Plan ==
    *Project [id#12L AS asId#15L]
    +- *Filter (id#12L = 4)
       +- *Range (0, 10, step=1, splits=Some(4))

so i am again missing the WholeStageCodegen. any idea why?

i create spark session for unit tests simply as:

    val session = SparkSession.builder
      .master("local[*]")
      .appName("test")
      .config("spark.sql.shuffle.partitions", 4)
      .getOrCreate()
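For a unit test that should not depend on the printed plan format, one option is to inspect the executed plan directly. The sketch below assumes Spark SQL 2.x on the classpath; `CodegenCheck` and `usesWholeStageCodegen` are hypothetical names chosen here, while `WholeStageCodegenExec`, `queryExecution.executedPlan`, the `spark.sql.codegen.wholeStage` setting and `debugCodegen()` are existing Spark pieces. As far as I understand, `explain` only prints the plan, so a custom `doGenCode` would be exercised only when the query actually runs or when the generated code is dumped; that part is an assumption worth verifying against your Spark version.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.execution.WholeStageCodegenExec
import org.apache.spark.sql.execution.debug._

// Hypothetical test helper, not part of Spark itself.
object CodegenCheck {
  // True if the executed physical plan contains a whole-stage codegen node.
  def usesWholeStageCodegen(df: DataFrame): Boolean =
    df.queryExecution.executedPlan
      .find(_.isInstanceOf[WholeStageCodegenExec])
      .isDefined

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("test")
      .config("spark.sql.shuffle.partitions", 4)
      // Enabled by default; set explicitly so the test does not rely on it.
      .config("spark.sql.codegen.wholeStage", true)
      .getOrCreate()
    import spark.implicits._

    // Same query as in the question, written with the $"..." column syntax.
    val df = spark.range(10).select($"id".as("asId")).where($"id" === 4)

    assert(usesWholeStageCodegen(df), "expected a WholeStageCodegenExec node")

    // Dumps the generated Java source for each codegen subtree.
    df.debugCodegen()

    // Running the query executes the generated code.
    df.collect()

    spark.stop()
  }
}
```

In a test suite the `assert` line would typically move into a test case, with the session shared across tests.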