"I don't think you could avoid this
in general, right, in any system?"
Really? NoSQL databases do efficient lookups (and scans) based on key and
partition. Look at Cassandra and HBase.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Does-filter-on-a
Looks like this has been supported since the 1.4 release :)
https://spark.apache.org/docs/1.4.1/api/scala/index.html#org.apache.spark.rdd.OrderedRDDFunctions
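The linked OrderedRDDFunctions class includes filterByRange, which according to its documentation can skip partitions when the RDD was partitioned by key range. Assuming that is the feature meant here, the idea can be sketched in plain Python without Spark. The names range_partition and filter_by_range and the sample bounds below are hypothetical; they only model the behaviour and are not Spark APIs.

```python
from bisect import bisect_left


def range_partition(pairs, bounds):
    """Split (key, value) pairs into len(bounds) + 1 partitions by key range.

    Partition i holds keys k with bounds[i - 1] < k <= bounds[i];
    the last partition holds every key above the final bound.
    """
    partitions = [[] for _ in range(len(bounds) + 1)]
    for key, value in pairs:
        partitions[bisect_left(bounds, key)].append((key, value))
    return partitions


def filter_by_range(partitions, bounds, lower, upper):
    """Return pairs with lower <= key <= upper and the partition indices scanned.

    Only partitions whose key range can overlap [lower, upper] are visited.
    """
    first = bisect_left(bounds, lower)
    last = bisect_left(bounds, upper)
    scanned = list(range(first, last + 1))
    hits = [
        (k, v)
        for i in scanned
        for (k, v) in partitions[i]
        if lower <= k <= upper
    ]
    return hits, scanned


# Hypothetical data: keys 1..20 split at the assumed bounds 5, 10, 15.
bounds = [5, 10, 15]
partitions = range_partition([(k, str(k)) for k in range(1, 21)], bounds)
hits, scanned = filter_by_range(partitions, bounds, 7, 9)
```

In this sketch the query for keys 7 to 9 touches a single partition out of four; a filter without range information would have to visit all of them.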
--
Thanks! Shall try it out.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Does-filter-on-an-RDD-scan-every-data-item-tp20170p20683.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
batch of data with .partitionBy()
using your CustomTuple hash implementation, persist it, and do not run any
operations on it that can remove its partitioner object.
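The advice above depends on the RDD keeping a known partitioner: with one, a lookup by key only needs to search the partition the key hashes to, whereas a plain filter has to visit every item. A minimal plain-Python sketch of that difference follows, with no Spark involved. The helpers partition_by, lookup and filter_scan are hypothetical names, and Python's built-in hash stands in for the CustomTuple hash implementation mentioned in the message.

```python
def partition_by(pairs, num_partitions, hash_fn=hash):
    """Place each (key, value) pair in the partition chosen by its key's hash."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[hash_fn(key) % num_partitions].append((key, value))
    return partitions


def lookup(partitions, key, hash_fn=hash):
    """Search only the partition the key maps to.

    Returns the matching values and the number of items examined.
    """
    target = partitions[hash_fn(key) % len(partitions)]
    return [v for k, v in target if k == key], len(target)


def filter_scan(partitions, key):
    """Without partitioner knowledge: examine every item in every partition."""
    examined = sum(len(p) for p in partitions)
    return [v for p in partitions for k, v in p if k == key], examined


# Hypothetical data: 100 integer keys spread over 4 partitions.
pairs = [(i, i * i) for i in range(100)]
parts = partition_by(pairs, 4)
values, examined = lookup(parts, 42)
all_values, all_examined = filter_scan(parts, 42)
```

Both calls return the same values, but the partition-aware lookup examines only the items of one partition. This only holds while the partitioning is preserved, which is why the message warns against operations that drop the partitioner.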
--
scenario.
--
Any thoughts on how Spark SQL could help in our scenario?
--
in this
scenario?
Thanks,
Nitin.
--
are done by key,
irrespective of whether the key exists in a partition or not?
--
, irrespective
of whether the key exists in a partition or not?
--
in a partition or not?