Hi,
I am curious about this comment:
if (offset < expected) { // -- (a)
  // this can happen when compression is enabled in Kafka (seems to be fixed in 0.10)
  // should we check if the offset is way off from consumedOffset (say > 1M)?
  LOG.warn(
      "{}: ignor
s/java/core/src/main/java/org/apache/beam/sdk/transforms/Partition.java
>
> On Thu, Jul 23, 2020 at 5:01 AM wang Wu wrote:
>
>> Hi,
>> Suppose that a Kafka topic has only 3 partitions. Now we want to
>> partition it into 20 partitions, each of which will produce an output coll
/io/cassandra/CassandraIO.java#L1139>
>
>> On 20 Jul 2020, at 12:49, wang Wu <faran...@gmail.com> wrote:
>>
>> But unfortunately that approach will not work, as Session and/or Cluster are not
>> serialisable.
>>
>> Regards
>> Dinh
>
Hi,
Suppose that a Kafka topic has only 3 partitions. Now we want to partition it
into 20 partitions, each of which will produce an output collection. The purpose is
to write to the sink in parallel from all 20 output collections.
Will this code achieve that purpose?
KafkaIO.Read reader =
Kaf
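In case it helps, a minimal sketch of what the truncated snippet could look like, combining KafkaIO.read() with Beam's Partition transform. The broker address, topic name, and key-hashing partition function are placeholders, and non-null string keys are assumed:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.transforms.Partition;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PartitionedKafkaRead {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create();

    PCollection<KV<String, String>> records =
        p.apply(
            KafkaIO.<String, String>read()
                .withBootstrapServers("broker:9092") // placeholder
                .withTopic("my-topic")               // placeholder
                .withKeyDeserializer(StringDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withoutMetadata());

    // Partition.of splits one PCollection into N collections; each element
    // is routed by the PartitionFn, here a simple hash of the key.
    PCollectionList<KV<String, String>> parts =
        records.apply(
            Partition.of(
                20,
                (KV<String, String> kv, int n) -> Math.floorMod(kv.getKey().hashCode(), n)));

    for (PCollection<KV<String, String>> part : parts.getAll()) {
      // part.apply(...): attach the sink to each of the 20 collections here.
    }

    p.run().waitUntilFinish();
  }
}

Note that Partition only fans out the downstream processing; the parallelism of the read itself is still bounded by the topic's 3 Kafka partitions.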
But unfortunately that approach will not work, as Session and/or Cluster are not
serialisable.
Regards
Dinh
> On 20 Jul 2020, at 17:42, wang Wu wrote:
>
> Hi,
>
> We are thinking of tuning connection pooling like this:
> https://docs.datastax.com/en/developer/java-driver/
> ...custom Cluster
> instance with all required configuration.
>
> In both cases, it will require CassandraIO modification.
>
>
>> On 18 Jul 2020, at 13:11, wang Wu wrote:
>>
>> I notice that the standard CassandraIO sets up Cluster with basic settings.
>> Is
I notice that the standard CassandraIO sets up Cluster with basic settings. Is
it possible to implement a custom CassandraIO in which I can customise the Datastax
driver? Any sample code would be helpful. Thanks
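A workaround that avoids touching CassandraIO for the write path is to build the Cluster inside a DoFn's @Setup and keep it in transient fields, so the non-serialisable driver objects are never serialised with the pipeline. A sketch against the Datastax 3.x driver; host, keyspace, table, and pooling values are placeholders:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;
import com.datastax.driver.core.Session;
import org.apache.beam.sdk.transforms.DoFn;

// Cluster and Session live in transient fields and are created per worker
// in @Setup, so nothing non-serialisable is captured at construction time.
class CustomCassandraWriteFn extends DoFn<String, Void> {
  private transient Cluster cluster;
  private transient Session session;

  @Setup
  public void setup() {
    PoolingOptions pooling =
        new PoolingOptions()
            .setCoreConnectionsPerHost(HostDistance.LOCAL, 4) // illustrative values
            .setMaxConnectionsPerHost(HostDistance.LOCAL, 10);
    cluster =
        Cluster.builder()
            .addContactPoint("cassandra-host") // placeholder
            .withPoolingOptions(pooling)
            .build();
    session = cluster.connect("my_keyspace"); // placeholder keyspace
  }

  @ProcessElement
  public void processElement(@Element String id) {
    // Placeholder table and schema; bind values rather than concatenating.
    session.execute("INSERT INTO my_table (id) VALUES (?)", id);
  }

  @Teardown
  public void teardown() {
    if (session != null) session.close();
    if (cluster != null) cluster.close();
  }
}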
On 30 Jun 2020, at 23:53, wang Wu wrote:
>
> We encountered a similar exception with KafkaUnboundedReader. By similarity I
> mean it starts from
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint
>
> And it ends at org.apache.beam.sdk.io.kafka.KafkaUnboundedReader.advance
> ...different threads. So we need to check that there is no issue under
> high load with that.
>
> Btw, which Kafka client version do you use?
>
>> On 30 Jun 2020, at 18:53, wang Wu <faran...@gmail.com> wrote:
>>
>> We encountered
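On the threading point: the Kafka client documents KafkaConsumer as not thread-safe, so if advance() can be reached from more than one thread, access to the consumer has to be serialised externally. A minimal illustration of such a guard (a hypothetical wrapper, not Beam's actual code; the Properties must carry bootstrap servers and byte-array deserializers, and poll(Duration) assumes kafka-clients 2.0+):

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

class GuardedConsumer {
  private final KafkaConsumer<byte[], byte[]> consumer;
  private final Object lock = new Object();

  GuardedConsumer(Properties config) {
    this.consumer = new KafkaConsumer<>(config);
  }

  ConsumerRecords<byte[], byte[]> poll(Duration timeout) {
    // Only one thread at a time may touch the consumer; unguarded concurrent
    // access makes KafkaConsumer throw ConcurrentModificationException.
    synchronized (lock) {
      return consumer.poll(timeout);
    }
  }
}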
..., Alexey Romanenko
> wrote:
>
> I don’t think it’s a known issue. Could you tell which version of Beam you use?
>
>> On 28 Jun 2020, at 14:43, wang Wu wrote:
>>
>> Hi,
>> We run a Beam pipeline on Spark in streaming mode. We subscribe to
>> multiple Kaf
Hi,
We run a Beam pipeline on Spark in streaming mode. We subscribe to multiple
Kafka topics. Our job runs fine until it is under heavy load: millions of Kafka
messages coming in per second. The exception looks like a concurrency issue. Is it a
known bug in Beam or some Spark configuration we could