RE: Can a map function return null
From: Evo Eftimov [mailto:evo.efti...@isecc.com]

In fact you can return “NULL” from your initial map and hence not resort to Optional at all.
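A minimal sketch of that idea in Java 8, assuming the JavaRDD<String> words and the odd-length test from Steve's snippet below (the "NULL" sentinel is illustrative, not a Spark convention):

    // Map straight to String, writing the literal "NULL" for dropped
    // elements, then filter the sentinel back out.
    JavaRDD<String> kept = words
        .map(s -> (s.length() % 2 == 1) ? "NULL" : s)
        .filter(s -> !"NULL".equals(s));

One caveat with this design: the sentinel collides with any genuine "NULL" string in the data, so it only works when that value cannot occur.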
RE: Can a map function return null
From: Evo Eftimov [mailto:evo.efti...@isecc.com]
Sent: Sunday, April 19, 2015 9:48 PM
To: 'Steve Lewis'
Cc: 'Olivier Girardot'; 'user@spark.apache.org'

Well, you can do another map to turn the Optional into a String: in the cases when the Optional is empty you can store e.g. “NULL” as the value of the RDD element.

If this is not acceptable (based on the objectives of your architecture), and IF returning plain null instead of Optional does throw a Spark exception, THEN as far as I am concerned: checkmate.
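A sketch of that second map, assuming the JavaRDD<Optional<String>> wordsFiltered from Steve's snippet below (java.util.Optional, Java 8; the "NULL" sentinel is illustrative):

    // Collapse Optional<String> back to String, substituting "NULL"
    // for the empty values instead of dropping them.
    JavaRDD<String> strings = wordsFiltered.map(opt -> opt.orElse("NULL"));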
Re: Can a map function return null
From: Steve Lewis [mailto:lordjoe2...@gmail.com]
Sent: Sunday, April 19, 2015 8:16 PM
To: Evo Eftimov
Cc: Olivier Girardot; user@spark.apache.org

So you imagine something like this:

JavaRDD<String> words = ...
JavaRDD<Optional<String>> wordsFiltered = words.map(new Function<String, Optional<String>>() {
    @Override
    public Optional<String> call(String s) throws Exception {
        if (s.length() % 2 == 1)   // drop strings of odd length
            return Optional.empty();
        else
            return Optional.of(s);
    }
});

That seems to return the wrong type: a JavaRDD<Optional<String>>, which cannot be used as the JavaRDD<String> that the next step expects.

--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
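For completeness, one way to collapse that JavaRDD<Optional<String>> back into a JavaRDD<String>, sketched with Java 8 lambdas against the Spark 1.x Java API (where flatMap expects an Iterable):

    // Empty Optionals contribute zero elements; present ones are unwrapped.
    JavaRDD<String> unwrapped = wordsFiltered.flatMap(opt ->
        opt.isPresent() ? Collections.singletonList(opt.get())
                        : Collections.<String>emptyList());   // java.util.Collections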
Re: Can a map function return null
From: Evo Eftimov
Sent: Sunday, April 19, 2015 12:17 PM

I am on the move at the moment so I can't try it immediately, but from previous memory/experience I think that if you return plain null you will get a Spark exception.

Anyway, you can try it and see what happens, and then ask the question.

If you do get an exception, try Optional instead of plain null.

Sent from Samsung Mobile
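A quick way to run that experiment locally, sketched on the assumption of an existing JavaSparkContext sc (whether the action throws is exactly the open question):

    // Three strings; the odd-length ones map to null.
    JavaRDD<String> maybeNulls = sc.parallelize(Arrays.asList("a", "bb", "ccc"))
        .map(s -> (s.length() % 2 == 1) ? null : s);            // java.util.Arrays
    long survivors = maybeNulls.filter(s -> s != null).count(); // 1 if nulls pass through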
Re: Can a map function return null
From: Olivier Girardot
Date: 2015/04/18 22:04 (GMT+00:00)
To: Steve Lewis, user@spark.apache.org

You can return an RDD with null values inside and afterwards filter on "item != null". In Scala (or even in Java 8) you'd rather use Option/Optional, and in Scala they're directly usable from Spark.

Example:

sc.parallelize(1 to 1000).flatMap(item => if (item % 2 == 0) Some(item) else None).collect()

res0: Array[Int] = Array(2, 4, 6, ...)

Regards,

Olivier.

On Sat, Apr 18, 2015 at 20:44, Steve Lewis wrote:

> I find a number of cases where I have a JavaRDD and I wish to transform
> the data and, depending on a test, return 0 or 1 items (don't suggest a
> filter - the real case is more complex). So I currently do something like
> the following - perform a flatMap returning a list with 0 or 1 entries
> depending on the isUsed function.
>
> JavaRDD<Foo> original = ...
> JavaRDD<Foo> words = original.flatMap(new FlatMapFunction<Foo, Foo>() {
>     @Override
>     public Iterable<Foo> call(final Foo s) throws Exception {
>         List<Foo> ret = new ArrayList<Foo>();
>         if (isUsed(s))
>             ret.add(transform(s));
>         return ret; // contains 0 items if isUsed is false
>     }
> });
>
> My question is: can I do a map returning the transformed data, and null if
> nothing is to be returned, as shown below - what does Spark do with a map
> function returning null?
>
> JavaRDD<Foo> words = original.map(new Function<Foo, Foo>() {
>     @Override
>     public Foo call(final Foo s) throws Exception {
>         if (isUsed(s))
>             return transform(s);
>         return null; // not used - what happens now?
>     }
> });
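A Java 8 rendering of that same pattern, sketched against Steve's types (Foo, isUsed and transform are his placeholders, not real APIs):

    // Null-then-filter version: map may produce nulls, filter removes them.
    JavaRDD<Foo> words = original
        .map(s -> isUsed(s) ? transform(s) : null)
        .filter(item -> item != null);

Or, staying closer to the Scala one-liner, a flatMap over an Optional gives the same result without ever materializing nulls.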