Re: Dynamic metric names

2019-05-06 Thread Sergey Zhemzhitsky
…it may require a code change to support your needs. Thanks, Saisai. Sergey Zhemzhitsky wrote on Sat, May 4, 2019 at 4:44 PM: > Hello Spark Users! > Just wondering whether it is possible to register a metric source without metrics known in advance…

Dynamic metric names

2019-05-04 Thread Sergey Zhemzhitsky
Hello Spark Users! Just wondering whether it is possible to register a metric source without metrics known in advance and add the metrics themselves to this source later on? It seems that currently MetricsSystem puts all the metrics from the source's MetricRegistry into a shared MetricRegistry of…
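The workaround usually discussed for this question is to keep a live reference to the source's own registry so metrics added later are still visible. Below is a minimal, Spark-free sketch of that pattern; the class names `MetricSource` and `MetricsSystem` here are illustrative stand-ins, not Spark's actual API.

```python
# A metric "source" whose registry can gain metrics after the source
# has already been registered with the metrics system.

class MetricSource:
    def __init__(self, name):
        self.name = name
        self.registry = {}          # metric name -> zero-arg gauge function

    def gauge(self, metric_name, fn):
        """Register a gauge at any time, even after the source is attached."""
        self.registry[metric_name] = fn

class MetricsSystem:
    def __init__(self):
        self.sources = {}

    def register(self, source):
        # Keep a reference to the source itself rather than copying its
        # metrics once at registration time; later additions stay visible.
        self.sources[source.name] = source

    def snapshot(self):
        return {
            f"{name}.{metric}": fn()
            for name, src in self.sources.items()
            for metric, fn in src.registry.items()
        }

system = MetricsSystem()
source = MetricSource("myJob")
system.register(source)                    # registered with no metrics yet

source.gauge("recordsSeen", lambda: 42)    # added later, dynamically
print(system.snapshot())                   # {'myJob.recordsSeen': 42}
```

The thread's point is that Spark's real MetricsSystem copies metrics into a shared registry at registration time, which is exactly what this sketch avoids.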

Re: Accumulator guarantees

2018-05-10 Thread Sergey Zhemzhitsky
…/a6fc300e91273230e7134ac6db95ccb4436c6f8f/core/src/main/scala/org/apache/spark/scheduler/Task.scala#L36 [3] https://github.com/apache/spark/blob/3990daaf3b6ca2c5a9f7790030096262efb12cb2/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1204 On Thu, May 10, 2018 at 10:24 PM, Sergey Zhemzhitsky <sz…

Accumulator guarantees

2018-05-10 Thread Sergey Zhemzhitsky
Hi there, Although Spark's docs state that there is a guarantee that - accumulators in actions will only be updated once - accumulators in transformations may be updated multiple times ... I'm wondering whether the same holds for transformations in the last stage of the job, or whether there is a…
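The distinction the docs draw can be simulated without a cluster: under task retries, accumulator updates from transformations are simply re-applied, while a scheduler can deduplicate updates per (stage, partition) for result tasks. The sketch below is an illustration of that guarantee, not Spark's actual scheduler code.

```python
# Simulate a stage of three task runs where partition 1 is retried,
# comparing naive accumulation (transformations) with per-partition
# deduplication (the exactly-once guarantee for actions).

class Accumulator:
    def __init__(self):
        self.value = 0
    def add(self, n):
        self.value += n

def run_task(acc, partition, seen, dedupe):
    key = ("stage-0", partition)
    if dedupe and key in seen:
        return                      # update already counted: drop the retry
    seen.add(key)
    acc.add(1)

naive, seen_naive = Accumulator(), set()
exact, seen_exact = Accumulator(), set()

for partition in [0, 1, 1]:         # partition 1 runs twice (a retry)
    run_task(naive, partition, seen_naive, dedupe=False)
    run_task(exact, partition, seen_exact, dedupe=True)

print(naive.value)  # 3: the retried partition is counted twice
print(exact.value)  # 2: each partition is counted exactly once
```

The question in this thread is essentially whether the last stage's transformations fall on the deduplicated side of this line.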

Re: AccumulatorV2 vs AccumulableParam (V1)

2018-05-04 Thread Sergey Zhemzhitsky
…if flexibility is more important to them. We can keep improving accumulator v2 without breaking backward compatibility. Thanks, Wenchen. On Thu, May 3, 2018 at 6:20 AM, Sergey Zhemzhitsky <szh.s...@gmail.com> wrote: > Hello guys, …

AccumulatorV2 vs AccumulableParam (V1)

2018-05-02 Thread Sergey Zhemzhitsky
Hello guys, I've started migrating my Spark jobs that use Accumulators V1 to AccumulatorV2 and ran into the following issues: 1. LegacyAccumulatorWrapper now requires the result type of AccumulableParam to implement equals. Otherwise the AccumulableParam, automatically wrapped into…
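The equals requirement makes sense once you see how a legacy-style wrapper might implement `isZero`: by comparing the current value against a freshly built zero element, which silently misbehaves for types with only reference equality. The sketch below is a hypothetical illustration of that failure mode; the class names do not match Spark's actual implementation.

```python
# Why a wrapper around a legacy accumulable param needs value equality
# on the accumulated type.

class Box:                       # no __eq__: reference equality only
    def __init__(self, n):
        self.n = n

class EqBox(Box):                # value equality, as the wrapper expects
    def __eq__(self, other):
        return isinstance(other, Box) and self.n == other.n

class SumParam:                  # legacy-style accumulable param
    def zero(self):
        return Box(0)
    def add(self, a, b):
        return Box(a.n + b.n)

class LegacyWrapper:
    def __init__(self, param, initial):
        self.param = param
        self.value = initial
    def is_zero(self):
        # relies on value equality with a freshly built zero element
        return self.value == self.param.zero()

print(LegacyWrapper(SumParam(), Box(0)).is_zero())    # False: broken!
print(LegacyWrapper(SumParam(), EqBox(0)).is_zero())  # True
```

Without `__eq__`, a logically-zero accumulator never compares equal to `zero()`, so the wrapper can misreport its state.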

Re: DataFrames :: Corrupted Data

2018-03-28 Thread Sergey Zhemzhitsky
…the job completes successfully. On Wed, Mar 28, 2018 at 10:31 PM, Jörn Franke <jornfra...@gmail.com> wrote: > Encoding issue of the data? E.g. Spark uses UTF-8, but the source encoding is different? > On 28. Mar 2018, at 20:25, Sergey Zhemzhitsky <szh.s...@gmail.com> wrote…

DataFrames :: Corrupted Data

2018-03-28 Thread Sergey Zhemzhitsky
Hello guys, I'm using Spark 2.2.0, and from time to time my job fails, printing the following errors into the log: scala.MatchError: profiles.total^@^@f2-a733-9304fda722ac^@^@^@^@profiles.10361.10005^@^@^@^@.total^@^@0075^@^@^@^@ scala.MatchError: pr^?files.10056.10040 (of class java.lang.String)

Best way of shipping self-contained pyspark jobs with 3rd-party dependencies

2017-12-08 Thread Sergey Zhemzhitsky
Hi PySparkers, What is currently the best way of shipping self-contained pyspark jobs with 3rd-party dependencies? There are some open JIRA issues [1], [2] as well as corresponding PRs [3], [4] and articles [5], [6], [7] regarding setting up the python environment with conda and virtualenv…
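One approach commonly described in articles like those linked is to pack the whole environment into an archive and ship it with the job. The commands below are a hedged sketch of that workflow using conda-pack (which postdates some of the linked material); the environment name, archive alias, and cluster settings are placeholders, and the exact flags depend on your Spark and cluster versions.

```shell
# Build and pack a self-contained environment (assumes conda and
# conda-pack are installed; package list is illustrative).
conda create -y -n myjob python=3.6 numpy pandas
conda pack -n myjob -o myjob_env.tar.gz

# Ship the packed env alongside the job; executors unpack the archive
# under the alias "environment" and use its interpreter.
spark-submit \
  --master yarn --deploy-mode cluster \
  --archives myjob_env.tar.gz#environment \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./environment/bin/python \
  my_job.py
```

The alternative discussed in the linked JIRAs/PRs is first-class support for creating such environments on the executors themselves, rather than packing them up front.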

Best way of shipping self-contained pyspark jobs with 3rd-party dependencies

2017-12-07 Thread Sergey Zhemzhitsky
Hi PySparkers, What is currently the best way of shipping self-contained pyspark jobs with 3rd-party dependencies? There are some open JIRA issues [1], [2] as well as corresponding PRs [3], [4] and articles [5], [6], regarding setting up the python environment with conda and virtualenv…

What is the purpose of having RDD.context and RDD.sparkContext at the same time?

2017-06-27 Thread Sergey Zhemzhitsky
Hello spark gurus, Could you please shed some light on the purpose of having two identical functions in RDD, RDD.context [1] and RDD.sparkContext [2]? RDD.context seems to be used more frequently across the source code. [1]
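The situation described is the classic "two public accessors, one object" pattern, typically kept for source compatibility. Here is a Spark-free Python sketch of it; the names mirror the RDD methods in question but the classes are illustrative, not Spark's.

```python
# Two accessors returning the same underlying object, one of them a
# historical alias kept so existing callers do not break.

class SparkContext:
    pass

class RDD:
    def __init__(self, sc):
        self._sc = sc

    @property
    def sparkContext(self):
        return self._sc

    @property
    def context(self):
        # historical alias; both accessors return the same context
        return self._sc

sc = SparkContext()
rdd = RDD(sc)
print(rdd.context is rdd.sparkContext)   # True
```

Removing either accessor would be a breaking API change, which is the usual reason both survive side by side.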

Re: Is GraphX really deprecated?

2017-05-16 Thread Sergey Zhemzhitsky
…lace, respectively. Jacek. On 13 May 2017 3:00 p.m., "Sergey Zhemzhitsky" <szh.s...@gmail.com> wrote: > Hello Spark users, > I just would like to know whether the GraphX component should be considered deprecated and no longer actively maintained and should not…

Is GraphX really deprecated?

2017-05-13 Thread Sergey Zhemzhitsky
Hello Spark users, I just would like to know whether the GraphX component should be considered deprecated and no longer actively maintained, and whether it should be avoided in favour of other graph-processing frameworks when starting new graph-processing projects on top of Spark. I'm asking…