> t, it may require code changes to
> support your needs.
>
> Thanks
> Saisai
>
> Sergey Zhemzhitsky wrote on Sat, May 4, 2019 at 4:44 PM:
>
>> Hello Spark Users!
>>
>> Just wondering whether it is possible to register a metric source without
>> metrics known in advance
Hello Spark Users!
Just wondering whether it is possible to register a metric source without
metrics known in advance and add the metrics themselves to this source
later on?
It seems that currently MetricsSystem puts all the metrics from the source's
MetricRegistry into a shared MetricRegistry of the MetricsSystem.
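For illustration, a minimal sketch of a source whose registry starts out
empty (DynamicSource and addGauge are hypothetical names; Spark's Source
trait is private[spark], which is exactly why supporting this may require
the code change mentioned in the reply above):

    import com.codahale.metrics.{Gauge, MetricRegistry}
    import org.apache.spark.metrics.source.Source

    // Sketch only: Source is private[spark], so outside Spark's own
    // packages this needs a change in Spark itself.
    class DynamicSource extends Source {
      override val sourceName: String = "dynamic"
      override val metricRegistry: MetricRegistry = new MetricRegistry

      // Hypothetical helper: add a gauge after the source has already
      // been registered with the MetricsSystem. Whether sinks pick it
      // up depends on when the shared registry is consumed.
      def addGauge(name: String, f: () => Long): Unit =
        metricRegistry.register(name, new Gauge[Long] {
          override def getValue: Long = f()
        })
    }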
On Thu, May 10, 2018 at 10:24 PM, Sergey Zhemzhitsky <szh.s...@gmail.com> wrote:
Hi there,
Although Spark's docs state that there is a guarantee that
- accumulators in actions will only be updated once
- accumulators in transformations may be updated multiple times
... I'm wondering whether the same is true for transformations in the
last stage of the job or there is a
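A minimal sketch of the documented behaviour (assuming a live SparkContext
sc; the RDD is deliberately not cached, and the names are illustrative):

    // Updates made inside a transformation may be re-applied when the
    // lineage is recomputed; updates made inside an action are
    // guaranteed to be counted exactly once per task.
    val inMap = sc.longAccumulator("in-map")
    val mapped = sc.parallelize(1 to 100).map { x =>
      inMap.add(1)        // transformation: may be counted more than once
      x * 2
    }
    mapped.count()
    mapped.count()        // second action recomputes the map stage
    println(inMap.value)  // 200, not 100

    val inAction = sc.longAccumulator("in-action")
    mapped.foreach(_ => inAction.add(1))  // action: counted exactly once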
> if flexibility is more important to them. We can keep improving
> accumulator v2 without breaking backward compatibility.
>
> Thanks,
> Wenchen
>
> On Thu, May 3, 2018 at 6:20 AM, Sergey Zhemzhitsky <szh.s...@gmail.com>
> wrote:
>>
>> Hello guys,
>>
Hello guys,
I've started migrating my Spark jobs that use Accumulators V1 to
AccumulatorV2 and ran into the following issues:
1. LegacyAccumulatorWrapper now requires the resulting type of the
AccumulableParam to implement equals. Otherwise the
AccumulableParam, automatically wrapped into
the job completes successfully
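For reference, a minimal sketch of what such a migration to AccumulatorV2
can look like (a hypothetical "running maximum" accumulator, not the
poster's actual code):

    import org.apache.spark.util.AccumulatorV2

    // Custom AccumulatorV2 tracking a running maximum.
    class MaxAccumulator extends AccumulatorV2[Long, Long] {
      private var _max = Long.MinValue
      override def isZero: Boolean = _max == Long.MinValue
      override def copy(): MaxAccumulator = {
        val acc = new MaxAccumulator
        acc._max = _max
        acc
      }
      override def reset(): Unit = _max = Long.MinValue
      override def add(v: Long): Unit = _max = math.max(_max, v)
      override def merge(other: AccumulatorV2[Long, Long]): Unit =
        _max = math.max(_max, other.value)
      override def value: Long = _max
    }

    // Registered explicitly instead of via sc.accumulator(zero)(param):
    // val acc = new MaxAccumulator
    // sc.register(acc, "max")

If staying on the LegacyAccumulatorWrapper path instead, declaring the
result type as a case class is an easy way to get a structural equals
for free.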
On Wed, Mar 28, 2018 at 10:31 PM, Jörn Franke <jornfra...@gmail.com> wrote:
> Encoding issue in the data? E.g. Spark uses UTF-8, but the source encoding is
> different?
>
>> On 28. Mar 2018, at 20:25, Sergey Zhemzhitsky <szh.s...@gmail.com> wrot
Hello guys,
I'm using Spark 2.2.0, and from time to time my job fails, printing the
following errors into the log:
scala.MatchError:
profiles.total^@^@f2-a733-9304fda722ac^@^@^@^@profiles.10361.10005^@^@^@^@.total^@^@0075^@^@^@^@
scala.MatchError: pr^?files.10056.10040 (of class java.lang.String)
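For context, scala.MatchError is thrown when a pattern match has no case
covering its input, which is what a key with corrupted bytes would trigger.
A hypothetical sketch (the actual match that fails is not shown in this
thread):

    // A corrupted key such as "pr\u0000files.10056.10040" falls through
    // every expected pattern; a catch-all case fails soft instead.
    def bucket(key: String): Option[String] = key match {
      case k if k.startsWith("profiles.") => Some(k.stripPrefix("profiles."))
      case _ => None  // without this catch-all: scala.MatchError
    }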
Hi PySparkers,
What is currently the best way of shipping self-contained PySpark jobs
with 3rd-party dependencies?
There are some open JIRA issues [1], [2] as well as corresponding PRs
[3], [4] and articles [5], [6], [7] regarding setting up the Python
environment with conda and virtualenv.
Hello Spark gurus,
Could you please shed some light on the purpose of having two
identical functions in RDD,
RDD.context [1] and RDD.sparkContext [2]?
RDD.context seems to be used more frequently across the source code.
[1]
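Both accessors simply return the RDD's SparkContext, so they behave
identically; RDD.context appears to be the older of the two, kept around
for source compatibility. A quick check (assuming a live SparkContext sc):

    val rdd = sc.parallelize(1 to 10)
    // Both accessors hand back the very same SparkContext reference:
    assert(rdd.context eq rdd.sparkContext)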
lace, respectively.
Jacek
On 13 May 2017 3:00 p.m., "Sergey Zhemzhitsky" <szh.s...@gmail.com> wrote:
> Hello Spark users,
>
> I just would like to know whether the GraphX component should be
> considered deprecated and no longer actively maintained
> and should not
Hello Spark users,
I'd just like to know whether the GraphX component should be considered
deprecated and no longer actively maintained,
and whether it should be avoided when starting new graph-processing projects
on top of Spark in favour of other graph-processing frameworks.
I'm asking