spark driver with OOM due to org.apache.spark.status.ElementTrackingStore

2022-08-02 Thread Jason Jun
Hi there, We have a Spark driver running 24x7, and we are continuously getting an OOM in the driver every 10 days. I found org.apache.spark.status.ElementTrackingStore keeping 85% of heap usage after analyzing a heap dump, like this image: [image: image.png] I found these parameters would be the root cause
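For context, ElementTrackingStore backs the live UI's application status store, and its growth is bounded by the spark.ui.retained* settings. A hedged sketch of the retention knobs usually lowered for long-running drivers; the values below are illustrative, not recommendations:

    # spark-defaults.conf (defaults shown in comments)
    spark.ui.retainedJobs           100     # default 1000
    spark.ui.retainedStages         100     # default 1000
    spark.ui.retainedTasks          10000   # default 100000
    spark.ui.retainedDeadExecutors  10      # default 100
    spark.sql.ui.retainedExecutions 100     # default 1000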

Is the Spark fair scheduler for Kubernetes?

2022-04-10 Thread Jason Jun
The official doc, https://spark.apache.org/docs/latest/job-scheduling.html, doesn't mention whether it works for Kubernetes clusters. Can anyone quickly answer this? TIA. Jason
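The fair scheduler described there operates inside a single SparkContext, so it should apply on Kubernetes as well. A minimal sketch of enabling it; the pool file path and pool contents are illustrative:

    spark.scheduler.mode            FAIR
    spark.scheduler.allocation.file /path/to/fairscheduler.xml

    <!-- fairscheduler.xml: one illustrative pool -->
    <allocations>
      <pool name="production">
        <schedulingMode>FAIR</schedulingMode>
        <weight>1</weight>
        <minShare>2</minShare>
      </pool>
    </allocations>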

sharing class between NonClosableMutableURLClassLoader and MutableURLClassLoader

2021-06-22 Thread Jason Jun
Hi there, I'm tweaking the Hive thrift server and Spark session to provide custom SQL capabilities, and I came across a java.lang.ClassNotFoundException when loading the custom session builder. What I found is that the custom session builder is being loaded by MutableURLClassLoader. I have no idea about w

java.lang.ClassNotFoundException for custom hive authentication

2021-06-22 Thread Jason Jun
Hi there, I'm leveraging the thrift server to provide a SQL service, using custom Hive authentication: -- hive.server2.custom.authentication.class com.abc.ABCAuthenticationProvider I've got this error when logging into the thrift server. The class path was set using the --jars option. I guess this
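For reference, hive.server2.custom.authentication.class expects an implementation of Hive's PasswdAuthenticationProvider. A minimal sketch, with the com.abc names taken from the example above and the credential check left as a stub:

    package com.abc

    import javax.security.sasl.AuthenticationException
    import org.apache.hive.service.auth.PasswdAuthenticationProvider

    class ABCAuthenticationProvider extends PasswdAuthenticationProvider {
      // Called once per login; throwing rejects the connection.
      override def Authenticate(user: String, password: String): Unit = {
        if (user == null || user.isEmpty || password == null)
          throw new AuthenticationException(s"Invalid credentials for user $user")
        // Replace with a real credential check (LDAP, database, ...).
      }
    }

One likely cause of the ClassNotFoundException: --jars feeds Spark's mutable classloader, while server-side authentication classes generally also need to be on the driver's own classpath (e.g. spark.driver.extraClassPath).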

Re: How to convert InternalRow to Row.

2020-11-30 Thread Jason Jun
And the new API is:

> val encoder = RowEncoder(schema)
> val row = encoder.createDeserializer().apply(internalRow)

Thanks,
Jia Ke

From: Wenchen Fan
Sent: Friday, November 27, 2020 9:32 PM
To: Jason Jun
C
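Expanded into a runnable sketch of the API quoted above; a minimal example assuming Spark 3.x, with an illustrative schema:

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.sql.catalyst.encoders.RowEncoder
    import org.apache.spark.sql.types._
    import org.apache.spark.unsafe.types.UTF8String

    val schema = StructType(Seq(
      StructField("id", IntegerType),
      StructField("name", StringType)))

    // InternalRow stores strings as UTF8String.
    val internalRow = InternalRow(1, UTF8String.fromString("jason"))

    // The encoder must be resolved and bound before creating a deserializer.
    val encoder = RowEncoder(schema).resolveAndBind()
    val row: Row = encoder.createDeserializer().apply(internalRow)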

How to convert InternalRow to Row.

2020-11-26 Thread Jason Jun
Hi dev, I'm working on generating a custom pipeline on the fly, which means I generate a SparkPlan along with each node in my pipeline. So my pipeline ends up with a PipeLineRelation extending BaseRelation, like: case class PipeLineRelation(schema: StructType, pipeLinePlan: LogicalPlan)(@transient over

Re: Query parsing error for the join query between different database

2016-05-18 Thread JaeSung Jun
as a table alias (which you are doing). Change the alias or place it between backticks and you should be fine.

2016-05-18 23:51 GMT+02:00 JaeSung Jun:
> It's spark 1.6.1 and hive 1.2.1 (spark-sql saying "SET
> spark.sql.hive.version=1.2.1").
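Concretely, against the query from the original post below, either fix works; both forms are a sketch:

    -- Option 1: pick an alias that is not a reserved word
    SELECT u.uid, d.name
    FROM userdb.user u, deptdb.dept d
    WHERE u.dept_id = d.id;

    -- Option 2: keep the alias but quote it with backticks
    SELECT `user`.uid, dept.name
    FROM userdb.user `user`, deptdb.dept
    WHERE `user`.dept_id = dept.id;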

Re: Query parsing error for the join query between different database

2016-05-18 Thread JaeSung Jun
It's Spark 1.6.1 and Hive 1.2.1 (spark-sql saying "SET spark.sql.hive.version=1.2.1"). Thanks

On 18 May 2016 at 23:31, Ted Yu wrote:
> Which release of Spark / Hive are you using?
> Cheers
> On May 18, 2016, at 6:12 AM, JaeSung Jun wrote:
> Hi,

Query parsing error for the join query between different database

2016-05-18 Thread JaeSung Jun
Hi, I'm working on a custom data source provider, and I'm using a fully qualified table name in the FROM clause, like the following: SELECT user.uid, dept.name FROM userdb.user user, deptdb.dept WHERE user.dept_id = dept.id and I've got the following error: MismatchedTokenException(279!=26) at org.antlr.

Unit test error

2016-04-28 Thread JaeSung Jun
Hi All, I'm developing a custom data source & relation provider based on Spark 1.6.1. Every unit test has its own Spark context, and each runs successfully when run one by one. But when running in sbt (sbt:test), an error pops up when initializing the Spark context, like the following: org.apache.spark.rpc.
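A common cause is sbt running suites in parallel, so several SparkContexts come up in one JVM at once. A hedged build.sbt sketch of the usual mitigation:

    // Run test suites sequentially so each SparkContext is fully
    // stopped before the next suite starts.
    parallelExecution in Test := false

    // Optionally fork a separate JVM for tests to isolate them
    // from sbt's own classloader.
    fork in Test := true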

Does RDD[Type1, Iterable[Type2]] split into multiple partitions?

2015-12-10 Thread JaeSung Jun
Hi, I'm currently working on an Iterable type of RDD, which is like: val keyValueIterableRDD[CaseClass1, Iterable[CaseClass2]] = buildRDD(...) If there is only one unique key and the Iterable is big enough, would this Iterable be partitioned across all executors, like the following? (executor1) (xxx, it
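For reference, the grouped values for a single key live entirely inside one partition; the Iterable is not split across executors. A small sketch that makes this visible; names and sizes are illustrative:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setMaster("local[4]").setAppName("group-demo"))

    // One hot key ("xxx") with many values.
    val grouped = sc.parallelize(1 to 1000000)
      .map(i => ("xxx", i))
      .groupByKey()

    // Each (key, Iterable) pair sits in exactly one partition, so a
    // single huge key cannot be spread across executors.
    grouped.mapPartitionsWithIndex { (idx, it) =>
      it.map { case (k, vs) => (idx, k, vs.size) }
    }.collect().foreach(println)

    sc.stop()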

unit test failure for hive query

2015-07-29 Thread JaeSung Jun
Hi, I'm working on custom SQL processing on top of Spark SQL, and I'm upgrading it along with Spark 1.4.1. I've got an error about multiple test suites accessing the Hive metastore at the same time, like: Cause: org.apache.derby.impl.jdbc.EmbedSQLException: Another instance of Derby may have alread
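The embedded Derby metastore only allows one connection per database directory, so suites running in parallel against the same metastore_db fail this way. A hedged sketch of pointing each suite at its own metastore (Spark 1.4-era HiveContext; paths illustrative):

    import java.nio.file.Files
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    // Give every suite a private Derby directory so concurrent suites
    // never open the same embedded database.
    val metastoreDir = Files.createTempDirectory("metastore_").toString
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("hive-test"))
    val hive = new HiveContext(sc)
    hive.setConf("javax.jdo.option.ConnectionURL",
      s"jdbc:derby:;databaseName=$metastoreDir;create=true")

Alternatively, disabling parallel test execution in sbt (parallelExecution in Test := false) avoids the contention entirely.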

Re: databases currently supported by Spark SQL JDBC

2015-07-09 Thread JaeSung Jun
As long as a JDBC driver is provided, any database can be used with the JDBC datasource provider. You can provide the driver class in the options field like the following: CREATE TEMPORARY TABLE jdbcTable USING org.apache.spark.sql.jdbc OPTIONS( url "jdbc:oracle:thin:@myhost:1521:orcl" driver "oracle.jdbc.driver.Ora
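A complete form of that statement as a sketch; the dbtable and credentials are illustrative:

    CREATE TEMPORARY TABLE jdbcTable
    USING org.apache.spark.sql.jdbc
    OPTIONS (
      url      "jdbc:oracle:thin:@myhost:1521:orcl",
      driver   "oracle.jdbc.driver.OracleDriver",
      dbtable  "HR.EMPLOYEES",
      user     "scott",
      password "tiger"
    );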

Re: Can't find postgresql jdbc driver when using external datasource

2015-04-21 Thread JaeSung Jun
e
> https://eradiating.wordpress.com/2015/04/17/using-spark-data-sources-to-load-data-from-postgresql/

--- Original Message ---
From: "JaeSung Jun"
Sent: April 21, 2015 1:05 AM
To: dev@spark.apache.org
Subject: Can't find postgresql jdbc driver when usin

Can't find postgresql jdbc driver when using external datasource

2015-04-21 Thread JaeSung Jun
Hi, I tried to get an external database table running, sitting on PostgreSQL. I've got a java.lang.ClassNotFoundException even though I added the driver jar using the --jars option, like the following: Is it a class loader hierarchy problem, or does anyone have an idea? Thanks - spark-sql --jars ../lib/postgresql-9.
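This is indeed a classloader hierarchy issue: --jars makes the jar visible to Spark's mutable classloader, but java.sql.DriverManager loads drivers from the driver's primordial classpath. A hedged sketch of the usual workaround; the jar file name is illustrative:

    # Put the JDBC driver on the driver's own classpath in addition to --jars.
    spark-sql \
      --driver-class-path ../lib/postgresql-<version>.jar \
      --jars ../lib/postgresql-<version>.jar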

Re: DDL parser class parsing DDL in spark-sql cli

2015-04-14 Thread JaeSung Jun
spark/sql/hive/HiveQl.scala>

On Tue, Apr 14, 2015 at 7:13 AM, JaeSung Jun wrote:
> Hi,
> While I've been walking through the spark-sql source code, I typed the
> following HiveQL:
> CREATE EXTERNAL TABLE user (uid STRING, age INT

DDL parser class parsing DDL in spark-sql cli

2015-04-14 Thread JaeSung Jun
Hi, While I've been walking through the spark-sql source code, I typed the following HiveQL: CREATE EXTERNAL TABLE user (uid STRING, age INT, gender STRING, job STRING, ts STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/hive/user'; and I finally came across ddl.scala after analysin

Re: Tachyon in Spark

2014-12-15 Thread Jun Feng Liu
Thanks for the response. I got the point; it sounds like today's Spark lineage does not push to Tachyon lineage. Would be good to see how it works. Jun Feng Liu. Haoyuan Li

Re: Tachyon in Spark

2014-12-12 Thread Jun Feng Liu
is not ready yet if Tachyon lineage does not work right now? Best Regards Jun Feng Liu IBM China Systems & Technology Laboratory in Beijing Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park No.8 Rd. Dong Bei Wang West, Dist. Haidian Beijing 100193 C

Re: HA support for Spark

2014-12-11 Thread Jun Feng Liu
Interesting, are you saying the StreamingContext checkpoint can regenerate the DAG? Best Regards Jun Feng Liu IBM China Systems & Technology Laboratory in Beijing Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park No.8 Rd. Dong Bei Wang West, Dist. Haidian Beijing 10
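The mechanism in question is the streaming checkpoint: StreamingContext.getOrCreate rebuilds the context, including its DStream lineage, from the checkpoint directory after a driver failure. A minimal sketch; the path, source, and batch interval are illustrative:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val checkpointDir = "hdfs:///checkpoints/my-app"

    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("ha-demo")
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint(checkpointDir)
      // Illustrative DStream graph; replace with the real sources.
      ssc.socketTextStream("localhost", 9999).count().print()
      ssc
    }

    // On restart, this recovers the DStream graph and pending state
    // from the checkpoint instead of rebuilding from scratch.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()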

Re: HA support for Spark

2014-12-10 Thread Jun Feng Liu
To: Jun Feng Liu/China/IBM@IBMCN, 2014-12-11 01:34
cc: Reynold Xin, "dev@spark.apach

Tachyon in Spark

2014-12-10 Thread Jun Feng Liu
Does Spark today really leverage Tachyon lineage to process data? It seems like the application should call the createDependency function in TachyonFS to create a new lineage node, but I did not find any place calling that in the Spark code. Did I miss anything? Best Regards Jun Feng Liu IBM China

Re: HA support for Spark

2014-12-10 Thread Jun Feng Liu
Well, it should not be mission impossible, considering there are so many HA solutions existing today. I would be interested to know if there is any specific difficulty. Best Regards Jun Feng Liu IBM China Systems & Technology Laboratory in Beijing Phone: 86-10-82452683 E-mail: liuj...@cn.ibm

HA support for Spark

2014-12-10 Thread Jun Feng Liu
when errors happen, but it seems it will lose track of all task statuses or even the executor information maintained in the Spark context. I am not sure if there is any existing stuff I can leverage to do that. Thanks for any suggestions Best Regards Jun Feng Liu IBM China Systems & Technology Laborator

Ooyala Spark JobServer

2014-12-04 Thread Jun Feng Liu
Hi, I am wondering about the status of the Ooyala Spark JobServer; is there any plan to get it into the Spark release? Best Regards Jun Feng Liu IBM China Systems & Technology Laboratory in Beijing Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park No.8 Rd. Dong Bei Wang

Re: Spark authenticate enablement

2014-09-16 Thread Jun Feng Liu
I see. Thank you, it works for me. It looks confusing to have two ways to expose configuration, though. Best Regards Jun Feng Liu IBM China Systems & Technology Laboratory in Beijing Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park No.8 Rd. Dong Bei Wang
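For anyone landing here later, a minimal sketch of the properties involved; the secret is illustrative and, outside YARN, must be set identically on all nodes:

    # spark-defaults.conf (or the equivalent SparkConf settings)
    spark.authenticate        true
    spark.authenticate.secret mySharedSecret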

Spark authenticate enablement

2014-09-11 Thread Jun Feng Liu
Does authentication only work for the YARN model, or did I miss something with the standalone model? Best Regards Jun Feng Liu IBM China Systems & Technology Laboratory in Beijing Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park No.8 Rd. Dong Bei Wang West, Dist. Haidian Bei

Re: Fine-Grained Scheduler on Yarn

2014-08-08 Thread Jun Feng Liu
Regards Jun Feng Liu IBM China Systems & Technology Laboratory in Beijing Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28,ZGC Software Park No.8 Rd.Dong Bei Wang West, Dist.Haidian Beijing 100193 China Sandy Ryza 2014/08/08 15:14 To Jun Feng Liu/China/IBM@IBMCN,

Re: Fine-Grained Scheduler on Yarn

2014-08-07 Thread Jun Feng Liu
Thanks for the echo on this. Is it possible to adjust resources based on container numbers? E.g., allocate more containers when the driver needs more resources, and return some resources by deleting containers when parts of them already have enough cores/memory. Best Regards Jun Feng Liu IBM China
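What is described here is essentially what dynamic allocation on YARN later provided: growing and shrinking the executor-container count with load. A hedged sketch of the relevant settings; the bounds are illustrative:

    spark.dynamicAllocation.enabled      true
    spark.dynamicAllocation.minExecutors 2
    spark.dynamicAllocation.maxExecutors 20
    # Required so shuffle data survives executor removal.
    spark.shuffle.service.enabled        true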

Re: Fine-Grained Scheduler on Yarn

2014-08-07 Thread Jun Feng Liu
Anyone know the answer? Best Regards Jun Feng Liu IBM China Systems & Technology Laboratory in Beijing Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park No.8 Rd. Dong Bei Wang West, Dist. Haidian Beijing 100193 China Jun Feng Liu/China/IBM 2014/08/0

Re:Re: Intellij IDEA can not recognize the MLlib package

2014-08-03 Thread jun
Got it, and added the spark-mllib library dependency: libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.0.0" It works, many thanks! BR Kitaev At 2014-08-03 05:09:23, "Sean Owen" wrote: > Yes, but it is nowhere in your project dependencies. >
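For completeness, a build.sbt sketch of the working setup; versions are illustrative for the Spark 1.0 era:

    scalaVersion := "2.10.4"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"  % "1.0.0",
      "org.apache.spark" %% "spark-mllib" % "1.0.0"
    )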

Re:Intellij IDEA can not recognize the MLlib package

2014-08-03 Thread jun
Sorry, the color is missing. The "mllib" is shown in red and the "import" line is grey. >> import org.apache.spark.mllib.recommendation.ALS At 2014-08-03 05:03:31, "jun" wrote: > Hi, > I have started my Spark exploration in IntelliJ IDEA local mode and wa

Intellij IDEA can not recognize the MLlib package

2014-08-03 Thread jun
Hi, I have started my Spark exploration in IntelliJ IDEA local mode and want to focus on the MLlib part, but when I put some example code in IDEA, it cannot recognize the mllib package; it just looks like this: > import org.apache.spark.SparkContext > import org.apache.spark.mllib.recommendation.ALS