One option is to build your main jar with the metrics jar included, as a
fat jar.
On Sat, Jun 27, 2020 at 8:04 AM Bryan Jeffrey
wrote:
> Srinivas,
>
> Thanks for the insight. I had not considered a dependency issue as the
> metrics jar works well applied on the driver. Perhaps my main jar
>
The connector uses Java driver CQL requests under the hood, which means it
responds to a changing database the way a normal application would. This
means retries may return a different set of data than the original
request if the underlying database has changed.
On Fri, Jun 26, 2020, 9:42 PM Jungtaek
I'm not sure how it is implemented, but in general I wouldn't expect such
behavior from connectors that read from non-streaming storage.
The query result may depend on "when" the records are fetched.
If you need to reflect the changes in your query, you'll probably want to
find a way
Srinivas,
Thanks for the insight. I had not considered a dependency issue as the metrics
jar works well applied on the driver. Perhaps my main jar includes the Hadoop
dependencies but the metrics jar does not?
I am confused, as the only Hadoop dependency also exists for the built-in
metrics
It should work when you give an HDFS path, as long as your jar exists at
that path.
Your error looks more like a security issue (Kerberos) or missing Hadoop
dependencies, I think; your error says:
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation
On Fri, Jun 26, 2020 at 8:44 PM
Cool. Are you not using a watermark?
Also, is it possible to start listening to offsets from a specific date/time?
Regards
Srini
On Sat, Jun 27, 2020 at 6:12 AM Eric Beabes
wrote:
> My apologies... After I set the 'maxOffsetsPerTrigger' to a value such as
> '20' it started working. Hopefully
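On the question of starting from a specific date/time: the Kafka source in
Spark 3.x accepts a `startingOffsetsByTimestamp` option, a JSON string
mapping each topic to per-partition epoch-millisecond timestamps. A minimal
sketch of building that value (the topic name "transactions" and the two
partitions are assumptions for illustration):

```python
import json
from datetime import datetime, timezone

# Start reading the (hypothetical) topic "transactions" from this instant.
start = datetime(2020, 6, 26, 0, 0, tzinfo=timezone.utc)
millis = int(start.timestamp() * 1000)

# One entry per partition (two partitions assumed here).
starting = json.dumps({"transactions": {"0": millis, "1": millis}})

# With a live SparkSession this would be passed to the Kafka source as:
#   .option("startingOffsetsByTimestamp", starting)
print(starting)
```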
My apologies... After I set the 'maxOffsetsPerTrigger' to a value such as
'20' it started working. Hopefully this will help someone. Thanks.
On Fri, Jun 26, 2020 at 2:12 PM Something Something <
mailinglist...@gmail.com> wrote:
> My Spark Structured Streaming job works fine when I set
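For reference, a sketch of where `maxOffsetsPerTrigger` sits among the Kafka
source options (broker address and topic name are hypothetical; option
values are strings, as Spark expects):

```python
# Hypothetical Kafka source options for a Structured Streaming read.
kafka_options = {
    "kafka.bootstrap.servers": "broker:9092",  # assumed broker address
    "subscribe": "events",                     # assumed topic name
    "startingOffsets": "earliest",
    "maxOffsetsPerTrigger": "20",  # cap on records fetched per micro-batch
}

# With a live SparkSession these would be applied as:
# df = spark.readStream.format("kafka").options(**kafka_options).load()
```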
My Spark Structured Streaming job works fine when I set "startingOffsets"
to "latest". When I simply change it to "earliest" and specify a new "checkpoint
directory", the job doesn't work. The states don't get timed out
after 10 minutes.
While debugging I noticed that my 'state' logic is indeed
Hi Jeff
Thanks for confirming the same.
I have also thought about reading every MongoDB document separately along
with its schema and then comparing it to the schemas of all the other
documents in the collection. For our huge database this is a horrible,
horrible approach, as you have already
Hi,
We have a use case where one record needs to be in two different
aggregations.
Say, for example, a credit card transaction "A" belongs to the
transaction categories ATM and cross-border.
If I need to take the count of ATM transactions, I need to consider
transaction A. For count of
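One common way to let a record feed several aggregations is to explode its
category list so it contributes one row per category. Sketched here in plain
Python to show the counting logic (the transactions and category names are
made up; in Spark this would be `explode` on an array column followed by a
groupBy/count):

```python
from collections import Counter

# Hypothetical transactions, each tagged with one or more categories.
transactions = [
    {"id": "A", "categories": ["ATM", "cross-border"]},
    {"id": "B", "categories": ["ATM"]},
    {"id": "C", "categories": ["cross-border"]},
]

# "Explode": emit one (category, transaction) row per category,
# so transaction A is counted in both aggregations.
exploded = [(cat, t["id"]) for t in transactions for cat in t["categories"]]
counts = Counter(cat for cat, _ in exploded)
print(counts["ATM"], counts["cross-border"])  # 2 2
```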
It may be helpful to note that I'm running in YARN cluster mode. My goal
is to avoid having to manually distribute the JAR to all of the various
nodes, as this makes versioning deployments difficult.
On Thu, Jun 25, 2020 at 5:32 PM Bryan Jeffrey
wrote:
> Hello.
>
> I am running Spark 2.4.4. I
Hi Jorge,
If I set that in the spark submit command it works but I want it only in
the pod template file.
Best regards,
Michel
On Fri, Jun 26, 2020 at 2:01 PM, Jorge Machado wrote:
> Try to set spark.kubernetes.container.image
>
> On 26. Jun 2020, at 14:58, Michel Sumbul wrote:
>
> Hi guys,
>
Try to set spark.kubernetes.container.image
> On 26. Jun 2020, at 14:58, Michel Sumbul wrote:
>
> Hi guys,
>
> I am trying to use Spark 3 on top of Kubernetes and to specify a pod template
> for the driver.
>
> Here is my pod manifest for the driver, and when I do a spark-submit with the
> option:
Hi guys,
I am trying to use Spark 3 on top of Kubernetes and to specify a pod template
for the driver.
Here is my pod manifest for the driver, and when I do a spark-submit with the
option:
--conf
spark.kubernetes.driver.podTemplateFile=/data/k8s/podtemplate_driver3.yaml
I got the error message that I