Re: INFO CreateViewCommand:57 - Try to uncache `rawCounts` before replacing.

2021-12-21 Thread Jun Zhu
awCounts" ) ? I > expected to manage spark to manage the cache automatically given that I do > not explicitly call cache(). > > > > > > How come I do not get a similar warning from? > > sampleSDF.createOrReplaceTempView( "sample" ) &

Re: Spark batch job chaining

2020-08-09 Thread Jun Zhu
Hi, I am using Airflow in such a scenario.
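
For what it's worth, a minimal sketch of chaining two Spark batch jobs with Airflow, in the spirit of this reply (Airflow 1.10-era import; the DAG id, schedule, and script paths are hypothetical):

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # Airflow 1.10-era path

with DAG(
    dag_id="spark_batch_chain",       # hypothetical DAG id
    start_date=datetime(2020, 8, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = BashOperator(
        task_id="extract",
        bash_command="spark-submit /jobs/extract.py",    # hypothetical job
    )
    transform = BashOperator(
        task_id="transform",
        bash_command="spark-submit /jobs/transform.py",  # hypothetical job
    )
    # transform runs only after extract succeeds, chaining the batch jobs.
    extract >> transform
```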

Alternative for spark-redshift on scala 2.12

2020-05-05 Thread Jun Zhu
Hello Users, Is there any alternative for https://github.com/databricks/spark-redshift on scala 2.12.x? Thanks -- Jun Zhu, Sr. Engineer I, Data
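
One hedged alternative, if no Scala 2.12 build of spark-redshift is available: read over plain JDBC. The URL, table, and credentials below are placeholders, and the Amazon Redshift JDBC driver is assumed to be on the classpath. Note this loses spark-redshift's parallel UNLOAD-to-S3 path, so it suits smaller tables.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = (spark.read.format("jdbc")
      .option("url", "jdbc:redshift://examplecluster.example.com:5439/dev")  # placeholder
      .option("dbtable", "public.events")                                    # placeholder
      .option("user", "username")                                            # placeholder credentials
      .option("password", "password")
      .option("driver", "com.amazon.redshift.jdbc42.Driver")
      .load())
df.show()
```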

Re: Spark Thriftserver on YARN, SQL submit takes a long time.

2019-06-04 Thread Jun Zhu
://ip-172-19-104-48.ec2.internal:9083 19/06/04 05:58:18 INFO HiveMetaStoreClient: Opened a connection to metastore, current connections: 1 19/06/04 05:58:18 INFO HiveMetaStoreClient: Connected to metastore. 19/06/04 05:58:18 INFO RetryingMetaStoreClient: RetryingMetaStoreClient

Spark Thriftserver on YARN, SQL submit takes a long time.

2019-06-04 Thread Jun Zhu
(1), None)], false, false, false 19/06/04 05:50:15 INFO SparkExecuteStatementOperation: Result Schema: StructType(StructField(plan,StringType,true)) We had set the thrift server's min resources (10 instances) and initial resources (10) on YARN. Any thoughts? Any config issue that may relate
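
Assuming "min resources(10)" and "init resources(10)" refer to dynamic-allocation executor counts (an assumption, since the thread truncates before the exact config names), these are the standard keys. For the thrift server they would normally be passed to start-thriftserver.sh as --conf flags rather than set in code; a sketch:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.dynamicAllocation.enabled", "true")
         .config("spark.dynamicAllocation.minExecutors", "10")
         .config("spark.dynamicAllocation.initialExecutors", "10")
         .getOrCreate())
```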

Re: Different query result between spark thrift server and spark-shell

2019-04-25 Thread Jun Zhu
Never mind, I got the point: Spark replaces Hive's parquet support with its own. Should set spark.sql.hive.convertMetastoreParquet=false to use Hive's. Thanks On Thu, Apr 25, 2019 at 5:00 PM Jun Zhu wrote: Hi, We are using plugins from Apache Hudi which self-define a Hive external
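
A sketch of applying the fix, either at session build time or per session via SET (both forms use the documented spark.sql.hive.convertMetastoreParquet key):

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .enableHiveSupport()
         .config("spark.sql.hive.convertMetastoreParquet", "false")
         .getOrCreate())

# Or per session, e.g. through a thrift server connection:
spark.sql("SET spark.sql.hive.convertMetastoreParquet=false")
```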

Different query result between spark thrift server and spark-shell

2019-04-25 Thread Jun Zhu
'com.uber.hoodie.hadoop.HoodieInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' LOCATION 's3a://vungle2-dataeng/jun-test/stage20190424new' It works when queried in spark-shell, but not in the Spark thrift server with the same config. After debugging I found: the spark-shell execution plan differs from
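
A sketch reconstructing the truncated DDL above as it would run through spark.sql. The table name and column list are placeholders; the SerDe is assumed (it is the usual one for Hudi-registered tables), while the input/output formats and location are the ones quoted in the thread.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS hudi_stage (id STRING, payload STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
    STORED AS
      INPUTFORMAT 'com.uber.hoodie.hadoop.HoodieInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
    LOCATION 's3a://vungle2-dataeng/jun-test/stage20190424new'
""")
```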

Fwd: Does pyspark support python3.6?

2017-11-01 Thread Jun Shi
Thank you very much! Best, Jun

How to design the Spark application so that shuffle data will be automatically cleaned up after some iterations

2015-09-05 Thread Jun Li
can prevent the ever-increasing shuffle data storage for computation that takes many iterations? Jun
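
The standard pattern for this: shuffle files are cleaned by the ContextCleaner once the RDDs that produced them are garbage-collected, so periodically checkpointing (and reassigning the variable so old RDD references are dropped) lets earlier shuffle data become unreferenced. A minimal sketch, with arbitrary iteration counts and a placeholder checkpoint directory:

```python
from pyspark import SparkContext

sc = SparkContext(appName="iterative-job")
sc.setCheckpointDir("hdfs:///tmp/checkpoints")  # placeholder path

rdd = sc.parallelize(range(1000)).map(lambda x: (x % 10, x))

for i in range(100):
    # Each iteration introduces one shuffle (reduceByKey).
    rdd = rdd.mapValues(lambda v: v + 1).reduceByKey(lambda a, b: a + b)
    if i % 10 == 9:
        # Truncate the lineage so earlier shuffle files become unreferenced
        # and eligible for cleanup after the driver GCs the old RDD objects.
        rdd.checkpoint()
        rdd.count()  # force materialization of the checkpoint
```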

Re: Re: Real-time data visualization with Zeppelin

2015-08-06 Thread jun
Hi Andy, Is there any method to convert an IPython notebook file (.ipynb) to a Spark Notebook file (.snb), or vice versa? BR Jun At 2015-07-13 02:45:57, andy petrella andy.petre...@gmail.com wrote: Heya, You might be looking for something like this I guess: https://www.youtube.com/watch?v

Re: Question about Spark Streaming Receiver Failure

2015-03-16 Thread Jun Yang
of operations, then there will be a lot of shuffle data. So you need to check the worker logs and see what happened (whether the disk is full, etc.). We have streaming pipelines running for weeks without having any issues. Thanks Best Regards On Mon, Mar 16, 2015 at 12:40 PM, Jun Yang yangjun...@gmail.com

Question about Spark Streaming Receiver Failure

2015-03-16 Thread Jun Yang
Guys, We have a project which builds upon Spark streaming. We use Kafka as the input stream and create 5 receivers. When this application had run for around 90 hours, all 5 receivers failed for some unknown reason. In my understanding, it is not guaranteed that a Spark streaming receiver will
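
For context, the receiver setup the thread describes looks roughly like this in Spark 1.x PySpark: several receiver-based Kafka streams unioned into one DStream. The ZooKeeper quorum, group id, topic, and batch interval are placeholders.

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="kafka-receivers")
ssc = StreamingContext(sc, 10)  # 10-second batches, arbitrary

# Five receiver-based input streams, unioned into one DStream.
streams = [
    KafkaUtils.createStream(ssc, "zk1:2181", "my-group", {"my-topic": 1})
    for _ in range(5)
]
unified = ssc.union(*streams)
unified.count().pprint()

ssc.start()
ssc.awaitTermination()
```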

Re: Question about Spark Streaming Receiver Failure

2015-03-16 Thread Jun Yang
On Mon, Mar 16, 2015 at 12:40 PM, Jun Yang yangjun...@gmail.com wrote: Guys, We have a project which builds upon Spark streaming. We use Kafka as the input stream and create 5 receivers. When this application had run for around 90 hours, all 5 receivers failed for some unknown reasons

Re: Question about Spark Streaming Receiver Failure

2015-03-16 Thread Jun Yang
spawn another receiver on another machine or on the same machine. Thanks Best Regards On Mon, Mar 16, 2015 at 1:08 PM, Jun Yang yangjun...@gmail.com wrote: Dibyendu, Thanks for the reply. I am reading your project homepage now. One quick question I care about is: If the receivers

Is It Feasible for Spark 1.1 Broadcast to Fully Utilize the Ethernet Card Throughput?

2015-01-09 Thread Jun Yang
Guys, I have a question regarding the Spark 1.1 broadcast implementation. In our pipeline, we have a large multi-class LR model, which is about 1GiB in size. To exploit Spark's parallelism, a natural approach is to broadcast this model file to the worker nodes. However, it looks like
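
A minimal sketch of the approach described, with a small stand-in for the ~1GiB model (the model structure and scoring function are placeholders):

```python
from pyspark import SparkContext

sc = SparkContext(appName="broadcast-model")

model = {"weights": [0.1] * 1000}  # stand-in for the ~1GiB LR model
bmodel = sc.broadcast(model)       # shipped to each executor once, not per task

def score(x):
    # Tasks read the model through .value; the driver copy is not
    # serialized into every task closure.
    return x * bmodel.value["weights"][0]

print(sc.parallelize(range(100)).map(score).collect())
```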

KafkaReceiver executor in spark streaming job on YARN suddenly killed by ResourceManager

2015-01-02 Thread Jun Ki Kim
Hi guys, I tried to run a Spark streaming job with Kafka on YARN. My business logic is very simple: just listen on a Kafka topic and write the DStream to HDFS on each batch iteration. For a few hours after launching the streaming job, it works well. However, it then suddenly gets killed by the ResourceManager. ResourceManager
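
A sketch of the job as described, using the Spark 1.x streaming API; the topic, quorum, batch interval, and output path are placeholders.

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="kafka-to-hdfs")
ssc = StreamingContext(sc, 60)  # one-minute batches, arbitrary

stream = KafkaUtils.createStream(ssc, "zk1:2181", "hdfs-writer", {"events": 1})
# saveAsTextFiles creates one output directory per batch under the prefix.
stream.map(lambda kv: kv[1]).saveAsTextFiles("hdfs:///data/events/batch")

ssc.start()
ssc.awaitTermination()
```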

Re: k-means clustering

2014-11-20 Thread Jun Yang
Guys, As to the questions of pre-processing, you could just migrate your logic to Spark before using K-means. I have only used Scala on Spark and haven't used the Python binding, but I think the basic steps must be the same. BTW, if your data set is big with a huge sparse feature dimension
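
A minimal sketch of the suggested flow in MLlib, shown in PySpark although the reply used Scala; the sparse vectors, dimension, and cluster count are placeholders.

```python
from pyspark import SparkContext
from pyspark.mllib.clustering import KMeans
from pyspark.mllib.linalg import Vectors

sc = SparkContext(appName="kmeans-sparse")

# Pre-processing (tokenizing, hashing, TF-IDF, ...) would happen here;
# the sparse vectors below are placeholders.
data = sc.parallelize([
    Vectors.sparse(10000, {0: 1.0, 42: 3.0}),
    Vectors.sparse(10000, {7: 2.0, 9999: 1.0}),
])
model = KMeans.train(data, k=2, maxIterations=10)
print(model.clusterCenters)
```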

Questions Regarding to MPI Program Migration to Spark

2014-11-16 Thread Jun Yang
Guys, We are currently migrating our backend pipeline to Spark. In our pipeline, we have an MPI-based HAC implementation; to ensure result consistency across the migration, we also want to migrate this MPI-implemented code to Spark. However, during the migration process, I found that there are

unsubscribe

2014-05-04 Thread ZHANG Jun
Original message. Subject: unsubscribe From: Nabeel Memon nm3...@gmail.com To: user@spark.apache.org Cc: unsubscribe