repartition vs partitionBy
Hi folks, I need to repartition a large data set (around 300 GB). As I see it, some partitions hold far more data than others (data skew). I have pair RDDs of the form [({},{}),({},{}),({},{})]. What is the best way to solve this problem?
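A note on the subject line: repartition(n) reshuffles into n roughly equal partitions without regard to keys, while partitionBy(n) hash-partitions a pair RDD by key — so with partitionBy a single hot key still lands entirely in one partition. A common workaround for that is key salting. Below is a minimal sketch of the idea; NUM_SALTS and the lambda shapes are illustrative, not from this thread:

```python
import random

NUM_SALTS = 8  # assumption: pick this based on the observed skew

def salt_key(key):
    """Expand one hot key into up to NUM_SALTS distinct composite keys."""
    return (key, random.randrange(NUM_SALTS))

# With a pair RDD in PySpark the idea looks roughly like:
#   salted = rdd.map(lambda kv: (salt_key(kv[0]), kv[1]))
#   spread = salted.partitionBy(num_partitions)   # hash-partitions by (key, salt)
#   # ...do the shuffle-heavy work, then strip the salt:
#   result = spread.map(lambda kv: (kv[0][0], kv[1]))

# Pure-Python illustration: one hot key fans out across several salt buckets.
hot = [("hot_key", i) for i in range(1000)]
salted = [(salt_key(k), v) for k, v in hot]
buckets = {s for ((_k, s), _v) in salted}
print(sorted(buckets))
```

After salting, any aggregation has to be done in two steps (first per (key, salt), then per key), which is the usual price paid for spreading a hot key.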
Build Failure
Hi, I tried to build the latest master branch of Spark with:

build/mvn -DskipTests clean package

[INFO] Reactor Summary:
[INFO] Spark Project Parent POM ................... SUCCESS [03:46 min]
[INFO] Spark Project Test Tags .................... SUCCESS [01:02 min]
[INFO] Spark Project Launcher ..................... SUCCESS [01:03 min]
[INFO] Spark Project Networking ................... SUCCESS [ 30.794 s]
[INFO] Spark Project Shuffle Streaming Service .... SUCCESS [ 29.496 s]
[INFO] Spark Project Unsafe ....................... SUCCESS [ 18.478 s]
[INFO] Spark Project Core ......................... SUCCESS [05:42 min]
[INFO] Spark Project Bagel ........................ SUCCESS [  6.082 s]
[INFO] Spark Project GraphX ....................... SUCCESS [ 23.478 s]
[INFO] Spark Project Streaming .................... SUCCESS [ 53.969 s]
[INFO] Spark Project Catalyst ..................... SUCCESS [02:12 min]
[INFO] Spark Project SQL .......................... SUCCESS [03:02 min]
[INFO] Spark Project ML Library ................... SUCCESS [02:57 min]
[INFO] Spark Project Tools ........................ SUCCESS [  3.139 s]
[INFO] Spark Project Hive ......................... SUCCESS [03:25 min]
[INFO] Spark Project REPL ......................... SUCCESS [ 18.303 s]
[INFO] Spark Project Assembly ..................... SUCCESS [01:40 min]
[INFO] Spark Project External Twitter ............. SUCCESS [ 16.707 s]
[INFO] Spark Project External Flume Sink .......... SUCCESS [ 52.234 s]
[INFO] Spark Project External Flume ............... SUCCESS [ 13.069 s]
[INFO] Spark Project External Flume Assembly ...... SUCCESS [  4.653 s]
[INFO] Spark Project External MQTT ................ SUCCESS [01:56 min]
[INFO] Spark Project External MQTT Assembly ....... SUCCESS [ 15.233 s]
[INFO] Spark Project External ZeroMQ .............. SUCCESS [ 13.267 s]
[INFO] Spark Project External Kafka ............... SUCCESS [ 41.663 s]
[INFO] Spark Project Examples ..................... FAILURE [07:36 min]
[INFO] Spark Project External Kafka Assembly ...... SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 40:07 min
[INFO] Finished at: 2015-10-08T13:14:31+05:30
[INFO] Final Memory: 373M/1205M
[INFO]
[ERROR] Failed to execute goal on project spark-examples_2.10: Could not resolve dependencies for project org.apache.spark:spark-examples_2.10:jar:1.6.0-SNAPSHOT: The following artifacts could not be resolved: com.twitter:algebird-core_2.10:jar:0.9.0, com.github.stephenc:jamm:jar:0.2.5: Could not transfer artifact com.twitter:algebird-core_2.10:jar:0.9.0 from/to central (https://repo1.maven.org/maven2): GET request of: com/twitter/algebird-core_2.10/0.9.0/algebird-core_2.10-0.9.0.jar from central failed: Connection reset -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn -rf :spark-examples_2.10
API to run Spark jobs
Hi folks, how can I submit my Spark app (Python) to the cluster without using spark-submit? Actually, I need to invoke jobs from a UI.
Re: API to run Spark jobs
Hi Jeff, thanks. More specifically, I need the REST API to submit a PySpark job — can you point me to the spark-submit REST API?

> On Oct 6, 2015, at 10:25 PM, Jeff Nadler <jnad...@srcginc.com> wrote:
>
> Spark standalone doesn't come with a UI for submitting jobs. Some Hadoop
> distros might; for example, EMR in AWS has a job submit UI.
>
> Spark submit just calls a REST API; you could build any UI you want on top of
> that...
>
> On Tue, Oct 6, 2015 at 9:37 AM, shahid qadri <shahidashr...@icloud.com> wrote:
> > Hi folks, how can I submit my Spark app (Python) to the cluster without using
> > spark-submit? Actually, I need to invoke jobs from a UI.
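For reference, the Spark standalone master exposes a REST submission gateway (by default on port 6066) that spark-submit itself talks to in cluster mode. A rough sketch of a submission payload follows — the host, file paths, and Spark version here are hypothetical, the exact field set should be checked against your Spark version, and note that Python support through this gateway is limited compared to spark-submit:

```python
import json

MASTER_REST_URL = "http://spark-master:6066"  # hypothetical master host

payload = {
    "action": "CreateSubmissionRequest",
    "clientSparkVersion": "1.5.1",
    "appResource": "file:/apps/jobs/my_job.py",   # hypothetical path
    "mainClass": "org.apache.spark.deploy.SparkSubmit",
    "appArgs": ["/apps/jobs/my_job.py"],
    "environmentVariables": {"SPARK_ENV_LOADED": "1"},
    "sparkProperties": {
        "spark.app.name": "my_job",
        "spark.master": "spark://spark-master:7077",
        "spark.submit.deployMode": "cluster",
    },
}

# With the `requests` library installed, the POST would look like:
# import requests
# resp = requests.post(
#     MASTER_REST_URL + "/v1/submissions/create",
#     data=json.dumps(payload),
#     headers={"Content-Type": "application/json"},
# )
# print(resp.json())  # contains a submissionId on success

print(json.dumps(payload, indent=2))
```

If this gateway proves too limited for Python apps, a dedicated job server layer in front of the cluster is the other common route.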
Custom Partitioner
Hi Sparkians, how can we create a custom partitioner in PySpark?
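Unlike the Scala API, PySpark has no Partitioner class to subclass; instead, RDD.partitionBy(numPartitions, partitionFunc) accepts a plain function from key to int. A minimal sketch (the routing rule here is purely illustrative):

```python
def region_partitioner(key):
    """Route keys to partitions by their first letter (illustrative rule)."""
    return 0 if str(key)[:1].lower() < "n" else 1

# In PySpark (assuming `sc` is a SparkContext):
#   pairs = sc.parallelize([("apple", 1), ("zebra", 2), ("Mango", 3)])
#   parted = pairs.partitionBy(2, region_partitioner)
#   parted.glom().collect()  # inspect which keys landed in which partition

# The partition function itself is plain Python, so it can be checked locally:
print(region_partitioner("apple"), region_partitioner("zebra"))
```

The returned int is taken modulo numPartitions, so the function only needs to be deterministic and cheap; it is called for every key during the shuffle.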
Re: How to efficiently write sorted neighborhood in PySpark
> On Aug 25, 2015, at 10:43 PM, shahid qadri <shahidashr...@icloud.com> wrote:
>
> Any resources on this?
>
>> On Aug 25, 2015, at 3:15 PM, shahid qadri <shahidashr...@icloud.com> wrote:
>>
>> I would like to implement the sorted neighborhood approach in Spark. What is the
>> best way to write that in PySpark?
Re: How to efficiently write sorted neighborhood in PySpark
Any resources on this?

> On Aug 25, 2015, at 3:15 PM, shahid qadri <shahidashr...@icloud.com> wrote:
>
> I would like to implement the sorted neighborhood approach in Spark. What is the
> best way to write that in PySpark?
How to efficiently write sorted neighborhood in PySpark
I would like to implement the sorted neighborhood approach in Spark. What is the best way to write that in PySpark?
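For context, sorted neighborhood sorts records by a blocking key and then compares only records inside a fixed-size sliding window. A minimal pure-Python sketch of the core loop follows — the field names, blocking rule, and WINDOW value are illustrative. In PySpark, the same shape is roughly rdd.sortBy(blocking_key) followed by a windowed pass, e.g. via zipWithIndex plus a self-join on neighboring indices, or mapPartitions with care taken at partition boundaries:

```python
WINDOW = 3  # assumption: window size tuned to the data

def blocking_key(record):
    # Illustrative rule: first three letters of the name field.
    return record["name"][:3].lower()

def sorted_neighborhood_pairs(records):
    """Sort by blocking key, then pair each record with its next WINDOW-1 neighbors."""
    ordered = sorted(records, key=blocking_key)
    pairs = []
    for i, rec in enumerate(ordered):
        for j in range(i + 1, min(i + WINDOW, len(ordered))):
            pairs.append((rec["name"], ordered[j]["name"]))
    return pairs

people = [{"name": "Smith"}, {"name": "Smyth"}, {"name": "Adams"}, {"name": "Smitt"}]
print(sorted_neighborhood_pairs(people))
```

The payoff is that the number of comparisons drops from O(n²) to O(n·w); the cost is that true matches whose blocking keys sort far apart are missed, which is why multi-pass variants with different keys are common.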
Disabling dynamic date/time formatting in the Python API or globally
Guys, I'm getting this error:

raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
RequestError: TransportError(400, u'MapperParsingException[failed to parse [SOURCES.DATE_COMP]]; nested: MapperParsingException[failed to parse date field [--], tried both date format [dateOptionalTime], and timestamp number with locale []]; nested: IllegalArgumentException[Invalid format: --]; ')
Re: disabling dynamic date time formatting in python api or globally
I know; I just want to disable automatic date conversion, as my date strings are not in a valid format (they can be empty as well).

On Sunday, February 15, 2015 at 7:29:59 PM UTC+5:30, David Pilato wrote:
> Sounds like -- is not a valid date.
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
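Elasticsearch's dynamic mapping has a date_detection flag that controls exactly this guessing; setting it to false stops strings from being auto-mapped as dates. Note it only affects dynamically created fields — a field already mapped as a date needs an explicit string mapping (or a reindex). A sketch of such a mapping follows; the index and type names are hypothetical, and the string/not_analyzed form matches the 1.x-era client in this thread:

```python
import json

mapping = {
    "mappings": {
        "my_type": {
            "date_detection": False,  # stop guessing dates from string values
            "properties": {
                "SOURCES": {
                    "properties": {
                        # Map the offending field explicitly as a plain string.
                        "DATE_COMP": {"type": "string", "index": "not_analyzed"},
                    }
                }
            },
        }
    }
}

# With the official Python client you would apply it at index creation:
# from elasticsearch import Elasticsearch
# es = Elasticsearch()
# es.indices.create(index="my_index", body=mapping)

print(json.dumps(mapping, indent=2))
```

With this in place, values like "--" or "" are simply indexed as strings instead of raising MapperParsingException.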