Re: [Structured Streaming] NullPointerException in long running query

2020-04-27 Thread Jungtaek Lim
The root cause of the exception occurred on the executor side ("Lost task 10.3 in stage 1.0 (TID 81, spark6, executor 1)"), so you may need to check there. On Tue, Apr 28, 2020 at 2:52 PM lec ssmi wrote: > Hi: > One of my long-running queries occasionally encountered the following > exception: …
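A common way such executor-side NPEs arise in long-running queries is user code (for example a UDF) dereferencing a null input; the driver-side trace only shows the stage failure, while the real trace sits in the executor log. A minimal hypothetical Scala sketch of that failure mode and a null-safe rewrite (app, column, and function names are made up, not taken from the thread):

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions.udf

  object NullSafeUdfSketch {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder.appName("npe-sketch").getOrCreate()
      import spark.implicits._

      // A UDF like this throws java.lang.NullPointerException on the
      // executor whenever the column value is null:
      //   udf((s: String) => s.toUpperCase)

      // Null-safe variant: wrap the nullable input in Option.
      val safeUpper = udf((s: String) => Option(s).map(_.toUpperCase).orNull)

      val df = Seq(Some("a"), None).toDF("value")
      df.select(safeUpper($"value")).show()
    }
  }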

[Structured Streaming] NullPointerException in long running query

2020-04-27 Thread lec ssmi
Hi: One of my long-running queries occasionally encountered the following exception: Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 1.0 failed 4 times, most recent failure: Lost task 10.3 in stage 1.0 (TID 81, spark6, executor 1): java.lang.NullPointerException …

Fwd: [Announcement] Analytics Zoo 0.8 release

2020-04-27 Thread Jason Dai
FYI :-) ---------- Forwarded message --------- From: Jason Dai Date: Tue, Apr 28, 2020 at 10:31 AM Subject: [Announcement] Analytics Zoo 0.8 release To: BigDL User Group Hi all, We are happy to announce the 0.8 release of Analytics Zoo, a unified…

[Announcement] Analytics Zoo 0.8 release

2020-04-27 Thread Jason Dai
Hi all, We are happy to announce the 0.8 release of Analytics Zoo, a unified Data Analytics and AI platform for distributed TensorFlow, Keras, PyTorch, BigDL, Apache Spark/Flink and Ray. Some of the notable new features in this release are: …

SparkLauncher reliability and scalability

2020-04-27 Thread mhd wrk
We are using SparkLauncher and SparkAppHandle.Listener to launch Spark applications from a Java web application and listen to the state changes. Our observation is that as the number of concurrent jobs grows, sometimes some of the state changes are not reported (e.g. some applications never report fi…
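For context, a minimal Scala sketch of the SparkLauncher / SparkAppHandle.Listener pattern being described (the path, class name, and master below are hypothetical). If callbacks are dropped under heavy concurrency, reading handle.getState() directly is one possible cross-check:

  import java.util.concurrent.CountDownLatch
  import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

  object LauncherListenerSketch {
    def main(args: Array[String]): Unit = {
      val finished = new CountDownLatch(1)

      val handle = new SparkLauncher()
        .setAppResource("/path/to/app.jar")   // hypothetical path
        .setMainClass("com.example.MyApp")    // hypothetical class
        .setMaster("yarn")
        .startApplication(new SparkAppHandle.Listener {
          override def stateChanged(h: SparkAppHandle): Unit = {
            println(s"state -> ${h.getState}")
            if (h.getState.isFinal) finished.countDown()
          }
          override def infoChanged(h: SparkAppHandle): Unit = ()
        })

      // Block until the listener observes a final state, then read the
      // state directly from the handle as a cross-check.
      finished.await()
      println(s"final state: ${handle.getState}")
    }
  }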

Unsubscribe

2020-04-27 Thread Natalie Ruiz
Unsubscribe Get Outlook for iOS

Re: [pyspark] Load a master data file to spark ecosystem

2020-04-27 Thread Arjun Chundiran
Below is the reason why I didn't use DataFrames directly. As per my understanding, while creating the DataFrame, Spark splits the file into partitions and makes it distributed. But my tree file contains data structured in radix tree format. tree_lookup_value is the method which we use to look up a specific key in that tree…

Re: [pyspark] Load a master data file to spark ecosystem

2020-04-27 Thread Arjun Chundiran
Hi Gourav, I am first creating RDDs and converting them into DataFrames, since I need to map the value from my tree file while making the DataFrames. Thanks, Arjun On Sun, Apr 26, 2020 at 9:33 PM Gourav Sengupta wrote: > Hi, > Why are you using RDDs? And how are the files stored in terms of…

Re: [pyspark] Load a master data file to spark ecosystem

2020-04-27 Thread Arjun Chundiran
Hi Roland, As per my understanding, while creating the DataFrame, Spark splits the file into partitions and makes it distributed. But my tree file contains data structured in radix tree format. tree_lookup_value is the method which we use to look up a specific key in that tree. So I don't…
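Not Arjun's actual code, but a sketch of the pattern under discussion: load the tree once on the driver, broadcast it, and apply the lookup in a map before converting to a DataFrame. Scala is used here for illustration (the thread is PySpark; the same shape works with sc.broadcast and a map in Python), and loadTree plus a plain Map stand in for the real radix tree and tree_lookup_value:

  import org.apache.spark.sql.SparkSession

  object TreeLookupSketch {
    // Placeholder for the poster's radix-tree loader; a plain Map stands
    // in for the real tree structure here.
    def loadTree(path: String): Map[String, String] = Map.empty

    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder.appName("tree-lookup").getOrCreate()
      import spark.implicits._
      val sc = spark.sparkContext

      // Load the lookup structure once on the driver and broadcast it,
      // so every partition can do lookups without reloading the file.
      val tree = sc.broadcast(loadTree("/path/to/tree.file"))

      val df = sc.textFile("/path/to/input")
        .map(key => (key, tree.value.getOrElse(key, "")))  // tree_lookup_value equivalent
        .toDF("key", "value")

      df.show()
    }
  }

This assumes the tree fits in executor memory and is serializable; if it doesn't, a join against a DataFrame built from the tree file may be the better route.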

Re: [pyspark] Load a master data file to spark ecosystem

2020-04-27 Thread Arjun Chundiran
Hi Sonal, The tree file is a file in radix tree format. tree_lookup_value is a function which looks up the value for a particular key in that tree. Thanks, Arjun On Sat, Apr 25, 2020 at 10:28 AM Sonal Goyal wrote: > How does your tree_lookup_value function work? > Thanks, > Sonal > Nube Technologies…

unsubscribe

2020-04-27 Thread Hongbin Liu
unsubscribe

Reading Hadoop Archive from Spark

2020-04-27 Thread To Quoc Cuong
Hello, After archiving parquets into a HAR (Hadoop Archive) file, its data has the following layout:
foo.har/_masterindex // stores hashes and offsets
foo.har/_index // stores file statuses
foo.har/part-[0..n] // stores the actual parquet files combined sequentially
So, we can access parquet files…
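Assuming the question is how to read the archived parquets back from Spark: Hadoop ships a HarFileSystem registered under the har:// scheme, which preserves the original logical file names inside the archive, so a sketch like the following should work (the archive path is hypothetical):

  import org.apache.spark.sql.SparkSession

  object ReadHarSketch {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder.appName("read-har").getOrCreate()

      // Address the parquet files through the har:// scheme; the path
      // inside the archive keeps its original logical name.
      val df = spark.read.parquet("har:///user/me/foo.har/some/dir")

      df.printSchema()
    }
  }

HAR archives are read-only; reads go through the _index/_masterindex files to locate byte ranges inside the part-* files, so listing and scanning can be slower than on the unarchived layout.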