[jira] [Created] (KAFKA-7849) Warning when adding GlobalKTable

2019-01-21 Thread Dmitry Minkovsky (JIRA)
Dmitry Minkovsky created KAFKA-7849:
---

 Summary: Warning when adding GlobalKTable
 Key: KAFKA-7849
 URL: https://issues.apache.org/jira/browse/KAFKA-7849
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.1.0
Reporter: Dmitry Minkovsky


Per 
https://lists.apache.org/thread.html/59c119be8a2723c501e0653fa3ed571e8c09be40d5b5170c151528b5@%3Cusers.kafka.apache.org%3E
 
When I add a GlobalKTable for topic "message-write-service-user-ids-by-email" 
to my topology, I get this warning:
 
[2019-01-19 12:18:14,008] WARN 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:421) [Consumer 
clientId=message-write-service-55f2ca4d-0efc-4344-90d3-955f9f5a65fd-StreamThread-2-consumer,
 groupId=message-write-service] The following subscribed topics are not 
assigned to any members: [message-write-service-user-ids-by-email] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7536) TopologyTestDriver cannot pre-populate GlobalKTable

2018-10-23 Thread Dmitry Minkovsky (JIRA)
Dmitry Minkovsky created KAFKA-7536:
---

 Summary: TopologyTestDriver cannot pre-populate GlobalKTable
 Key: KAFKA-7536
 URL: https://issues.apache.org/jira/browse/KAFKA-7536
 Project: Kafka
  Issue Type: Bug
Reporter: Dmitry Minkovsky


I have a GlobalKTable that's defined as

{code}

GlobalKTable userIdsByEmail = topology  
   .globalTable(USER_IDS_BY_EMAIL.name,
   USER_IDS_BY_EMAIL.consumed(),
   Materialized.as("user-ids-by-email"));
{code}

And the following test in Spock:

{code}
def topology = // my topology
def driver = new TopologyTestDriver(topology, config())

def cleanup() {
driver.close()
}

def "create from email request"() {

def store = driver.getKeyValueStore('user-ids-by-email')
store.put('string', ByteString.copyFrom(new byte[0]))
{code}

When I run this, I get the following:

{code}

[2018-10-23 19:35:27,055] INFO 
(org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl) mock 
Restoring state for global store user-ids-by-email

java.lang.NullPointerException
at 
org.apache.kafka.streams.processor.internals.AbstractProcessorContext.topic(AbstractProcessorContext.java:115)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.putInternal(CachingKeyValueStore.java:237)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:220)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:38)
at 
org.apache.kafka.streams.state.internals.InnerMeteredKeyValueStore.put(InnerMeteredKeyValueStore.java:206)
at 
org.apache.kafka.streams.state.internals.MeteredKeyValueBytesStore.put(MeteredKeyValueBytesStore.java:117)
at pony.message.MessageWriteStreamsTest.create from mailgun email 
request(MessageWriteStreamsTest.groovy:52)

[2018-10-23 19:35:27,189] INFO 
(org.apache.kafka.streams.processor.internals.StateDirectory) stream-thread 
[main] Deleting state directory 0_0 for task 0_0 as user calling cleanup.
{code}

I've noticed that I can {{put()}} to the store if I first write to it with 
{{driver.pipeInput}}. But otherwise I get the above error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-5824) Cannot write to key value store provided by ProcessorTopologyTestDriver

2017-09-01 Thread Dmitry Minkovsky (JIRA)
Dmitry Minkovsky created KAFKA-5824:
---

 Summary: Cannot write to key value store provided by 
ProcessorTopologyTestDriver
 Key: KAFKA-5824
 URL: https://issues.apache.org/jira/browse/KAFKA-5824
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.11.0.0
Reporter: Dmitry Minkovsky


I am trying to `put()` to a KeyValueStore that I got from 
ProcessorTopologyTestDriver#getKeyValueStore() as part of setup for a test. The 
JavaDoc endorses this use-case:

 * This is often useful in test cases to pre-populate the store before the 
test case instructs the topology to
 * {@link #process(String, byte[], byte[]) process an input message}, 
and/or to check the store afterward.

However, the `put()` results in the following error: 

{{
java.lang.IllegalStateException: This should not happen as offset() should only 
be called while a record is processed

at 
org.apache.kafka.streams.processor.internals.AbstractProcessorContext.offset(AbstractProcessorContext.java:139)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:193)
at 
org.apache.kafka.streams.state.internals.CachingKeyValueStore.put(CachingKeyValueStore.java:188)
at 
pony.UserEntityTopologySupplierTest.confirm-settings-requests(UserEntityTopologySupplierTest.groovy:81)
}}

This error seems straightforward: I am not doing the `put` within the context 
of stream processing. How do I reconcile this with the fact that I am trying to 
populate the store for a test, which the JavaDoc endorses?

Thank you,
Dmitry



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5573) kaka-clients 0.11.0.0 AdminClient#createTopics() does not set configs

2017-07-08 Thread Dmitry Minkovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Minkovsky resolved KAFKA-5573.
-
   Resolution: Not A Bug
Fix Version/s: 0.11.1.0

> kaka-clients 0.11.0.0 AdminClient#createTopics() does not set configs
> -
>
> Key: KAFKA-5573
> URL: https://issues.apache.org/jira/browse/KAFKA-5573
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.11.0.0
>Reporter: Dmitry Minkovsky
> Fix For: 0.11.1.0
>
>
> I am creating topics like
> ```
>private void createTopics(String[] topics, Map config) {
> log.info("creating topics: {} with config: {}", names, config);
> CreateTopicsResult result = admin.createTopics(
>   Arrays
> .stream(topics)
> .map(topic -> new NewTopic(topic, partitions, replication))
> .collect(Collectors.toList())
> );
> for (Map.Entry> entry : 
> result.values().entrySet()) {
> try {
> entry.getValue().get();
> log.info("topic {} created", entry.getKey());
> } catch (InterruptedException | ExecutionException e) {
> if (Throwables.getRootCause(e) instanceof 
> TopicExistsException) {
> log.info("topic {} existed", entry.getKey());
> }
> }
> }
> }
> ```
> where I call this function like 
> ```
> Map config = new HashMap<>();
> config.put("cleanup.policy", "compact");
> createTopics(new String[]{"topic"}, config);
> ```
> However, when I inspect the topic with 
> ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics 
> --entity-name topic --describe
> or 
> ./kafka-topics.sh --zookeeper localhost:2181 --describe --topic topic
> there are no configs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5573) kaka-clients 0.11.0.0 AdminClient#createTopics does not set configs

2017-07-08 Thread Dmitry Minkovsky (JIRA)
Dmitry Minkovsky created KAFKA-5573:
---

 Summary: kaka-clients 0.11.0.0 AdminClient#createTopics does not 
set configs
 Key: KAFKA-5573
 URL: https://issues.apache.org/jira/browse/KAFKA-5573
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.11.0.0
Reporter: Dmitry Minkovsky


I am creating topics like

```
   private void createTopics(String[] topics, Map config) {

log.info("creating topics: {} with config: {}", names, config);

CreateTopicsResult result = admin.createTopics(
  Arrays
.stream(topics)
.map(topic -> new NewTopic(topic, partitions, replication))
.collect(Collectors.toList())
);

for (Map.Entry> entry : 
result.values().entrySet()) {
try {
entry.getValue().get();
log.info("topic {} created", entry.getKey());
} catch (InterruptedException | ExecutionException e) {
if (Throwables.getRootCause(e) instanceof TopicExistsException) 
{
log.info("topic {} existed", entry.getKey());
}
}
}
}
```

where I call this function like 

```
Map config = new HashMap<>();
config.put("cleanup.policy", "compact");
createTopics(new String[]{"topic"}, config);
```

However, when I inspect the topic with 

./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics 
--entity-name topic --describe

or 

./kafka-topics.sh --zookeeper localhost:2181 --describe --topic topic

there are no configs.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-4628) Support KTable/GlobalKTable Joins

2017-06-07 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041869#comment-16041869
 ] 

Dmitry Minkovsky edited comment on KAFKA-4628 at 6/7/17 11:15 PM:
--

on KAFKA-4880 (a dupe of this ticket), @Michael Noll writes:

> FWIW, I wonder how much interest there actually is for this functionality. 
> Users have been requesting stream-globalTable joins, but personally I have 
> yet to run into a person that wants table-globalTable joins. Just saying.

I hadn't noticed that GlobalKTables were severely limited and assumed they had 
all the functionality of regular tables. I had been really excited to use them 
for "hydration"/denormalization. I have two models which will be present in my 
application in different quantities. Instances of one model will be very few in 
quantity compared to instances of the second model. I want to "hydrate" 
instances of the high-quantity model with instances of the low-quantity model, 
which can be available in its entirety on all streams app instances. 
KTable/KTable joins are great for denormalization because you can drive the 
update from both sides of the relationship. KTable/GlobalKTable joins would 
facilitates such updates while removing the need to repartition the left table. 


was (Author: dminkovsky):
on KAFKA-4880 (a dupe of this ticket), @Michael Noll writes:

> FWIW, I wonder how much interest there actually is for this functionality. 
> Users have been requesting stream-globalTable joins, but personally I have 
> yet to run into a person that wants table-globalTable joins. Just saying.

I hadn't noticed that GlobalKTables were severely limited and assumed they had 
all the functionality of regular tables. I had been really excited to use them 
for "hydration"/denormalization. I have two models which will be present in my 
application in different quantities. Instances of one model will be very few in 
quantity compared to instances of the second model. I want to "hydrate" 
instances of the high-quantity model with instances of the low-quantity model, 
which can be available in its entirety at all instances. KTable/KTable joins 
are great for denormalization because you can drive the update from both sides 
of the relationship. KTable/GlobalKTable joins would facilitates such updates 
while removing the need to repartition the left table. 

> Support KTable/GlobalKTable Joins
> -
>
> Key: KAFKA-4628
> URL: https://issues.apache.org/jira/browse/KAFKA-4628
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
> Fix For: 0.11.1.0
>
>
> In KIP-99 we have added support for GlobalKTables, however we don't currently 
> support KTable/GlobalKTable joins as they require materializing a state store 
> for the join. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-4628) Support KTable/GlobalKTable Joins

2017-06-07 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041869#comment-16041869
 ] 

Dmitry Minkovsky edited comment on KAFKA-4628 at 6/7/17 11:14 PM:
--

on KAFKA-4880 (a dupe of this ticket), @Michael Noll writes:

> FWIW, I wonder how much interest there actually is for this functionality. 
> Users have been requesting stream-globalTable joins, but personally I have 
> yet to run into a person that wants table-globalTable joins. Just saying.

I hadn't noticed that GlobalKTables were severely limited and assumed they had 
all the functionality of regular tables. I had been really excited to use them 
for "hydration"/denormalization. I have two models which will be present in my 
application in different quantities. Instances of one model will be very few in 
quantity compared to instances of the second model. I want to "hydrate" 
instances of the high-quantity model with instances of the low-quantity model, 
which can be available in its entirety at all instances. KTable/KTable joins 
are great for denormalization because you can drive the update from both sides 
of the relationship. KTable/GlobalKTable joins would facilitates such updates 
while removing the need to repartition the left table. 


was (Author: dminkovsky):
on KAFKA-4880 (a dupe of this ticket), @Michael Noll writes:

> FWIW, I wonder how much interest there actually is for this functionality. 
> Users have been requesting stream-globalTable joins, but personally I have 
> yet to run into a person that wants table-globalTable joins. Just saying.

I hadn't noticed that GlobalKTables were severely limited and assumed they had 
all the functionality of regular tables. I had been really excited to use them 
for "hydration" or denormalization. I have two models which will be present in 
my application in different quantities. Instances of one model will be very few 
in quantity compared to instances of the second model. I want to "hydrate" 
instances of the high-quantity model with instances of the low-quantity model, 
which can be available in its entirety at all instances. KTable/KTable joins 
are great for denormalization because you can drive the update from both sides 
of the relationship. KTable/GlobalKTable joins would facilitates such updates 
while removing the need to repartition the left table. 

> Support KTable/GlobalKTable Joins
> -
>
> Key: KAFKA-4628
> URL: https://issues.apache.org/jira/browse/KAFKA-4628
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
> Fix For: 0.11.1.0
>
>
> In KIP-99 we have added support for GlobalKTables, however we don't currently 
> support KTable/GlobalKTable joins as they require materializing a state store 
> for the join. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-4628) Support KTable/GlobalKTable Joins

2017-06-07 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041869#comment-16041869
 ] 

Dmitry Minkovsky edited comment on KAFKA-4628 at 6/7/17 11:13 PM:
--

on KAFKA-4880 (a dupe of this ticket), @Michale Noll writes:

> FWIW, I wonder how much interest there actually is for this functionality. 
> Users have been requesting stream-globalTable joins, but personally I have 
> yet to run into a person that wants table-globalTable joins. Just saying.

I hadn't noticed that GlobalKTables were severely limited and assumed they had 
all the functionality of regular tables. I had been really excited to use them 
for "hydration" or denormalization. I have two models which will be present in 
my application in different quantities. Instances of one model will be very few 
in quantity compared to instances of the second model. I want to "hydrate" 
instances of the high-quantity model with instances of the low-quantity model, 
which can be available in its entirety at all instances. KTable/KTable joins 
are great for denormalization because you can drive the update from both sides 
of the relationship. KTable/GlobalKTable joins would facilitates such updates 
while removing the need to repartition the left table. 


was (Author: dminkovsky):
on KAFKA-4880 (a dupe of this ticket), Michale Noll writes:

> FWIW, I wonder how much interest there actually is for this functionality. 
> Users have been requesting stream-globalTable joins, but personally I have 
> yet to run into a person that wants table-globalTable joins. Just saying.

I hadn't noticed that GlobalKTables were severely limited and assumed they had 
all the functionality of regular tables. I had been really excited to use them 
for "hydration" or denormalization. I have two models which will be present in 
my application in different quantities. Instances of one model will be very few 
in quantity compared to instances of the second model. I want to "hydrate" 
instances of the high-quantity model with instances of the low-quantity model, 
which can be available in its entirety at all instances. KTable/KTable joins 
are great for denormalization because you can drive the update from both sides 
of the relationship. KTable/GlobalKTable joins would facilitates such updates 
while removing the need to repartition the left table. 

> Support KTable/GlobalKTable Joins
> -
>
> Key: KAFKA-4628
> URL: https://issues.apache.org/jira/browse/KAFKA-4628
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
> Fix For: 0.11.1.0
>
>
> In KIP-99 we have added support for GlobalKTables, however we don't currently 
> support KTable/GlobalKTable joins as they require materializing a state store 
> for the join. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-4628) Support KTable/GlobalKTable Joins

2017-06-07 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041869#comment-16041869
 ] 

Dmitry Minkovsky edited comment on KAFKA-4628 at 6/7/17 11:14 PM:
--

on KAFKA-4880 (a dupe of this ticket), @Michael Noll writes:

> FWIW, I wonder how much interest there actually is for this functionality. 
> Users have been requesting stream-globalTable joins, but personally I have 
> yet to run into a person that wants table-globalTable joins. Just saying.

I hadn't noticed that GlobalKTables were severely limited and assumed they had 
all the functionality of regular tables. I had been really excited to use them 
for "hydration" or denormalization. I have two models which will be present in 
my application in different quantities. Instances of one model will be very few 
in quantity compared to instances of the second model. I want to "hydrate" 
instances of the high-quantity model with instances of the low-quantity model, 
which can be available in its entirety at all instances. KTable/KTable joins 
are great for denormalization because you can drive the update from both sides 
of the relationship. KTable/GlobalKTable joins would facilitates such updates 
while removing the need to repartition the left table. 


was (Author: dminkovsky):
on KAFKA-4880 (a dupe of this ticket), @Michale Noll writes:

> FWIW, I wonder how much interest there actually is for this functionality. 
> Users have been requesting stream-globalTable joins, but personally I have 
> yet to run into a person that wants table-globalTable joins. Just saying.

I hadn't noticed that GlobalKTables were severely limited and assumed they had 
all the functionality of regular tables. I had been really excited to use them 
for "hydration" or denormalization. I have two models which will be present in 
my application in different quantities. Instances of one model will be very few 
in quantity compared to instances of the second model. I want to "hydrate" 
instances of the high-quantity model with instances of the low-quantity model, 
which can be available in its entirety at all instances. KTable/KTable joins 
are great for denormalization because you can drive the update from both sides 
of the relationship. KTable/GlobalKTable joins would facilitates such updates 
while removing the need to repartition the left table. 

> Support KTable/GlobalKTable Joins
> -
>
> Key: KAFKA-4628
> URL: https://issues.apache.org/jira/browse/KAFKA-4628
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
> Fix For: 0.11.1.0
>
>
> In KIP-99 we have added support for GlobalKTables, however we don't currently 
> support KTable/GlobalKTable joins as they require materializing a state store 
> for the join. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4628) Support KTable/GlobalKTable Joins

2017-06-07 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041869#comment-16041869
 ] 

Dmitry Minkovsky commented on KAFKA-4628:
-

on KAFKA-4880 (a dupe of this ticket), Michale Noll writes:

> FWIW, I wonder how much interest there actually is for this functionality. 
> Users have been requesting stream-globalTable joins, but personally I have 
> yet to run into a person that wants table-globalTable joins. Just saying.

I hadn't noticed that GlobalKTables were severely limited and assumed they had 
all the functionality of regular tables. I had been really excited to use them 
for "hydration" or denormalization. I have two models which will be present in 
my application in different quantities. Instances of one model will be very few 
in quantity compared to instances of the second model. I want to "hydrate" 
instances of the high-quantity model with instances of the low-quantity model, 
which can be available in its entirety at all instances. KTable/KTable joins 
are great for denormalization because you can drive the update from both sides 
of the relationship. KTable/GlobalKTable joins would facilitates such updates 
while removing the need to repartition the left table. 

> Support KTable/GlobalKTable Joins
> -
>
> Key: KAFKA-4628
> URL: https://issues.apache.org/jira/browse/KAFKA-4628
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
> Fix For: 0.11.1.0
>
>
> In KIP-99 we have added support for GlobalKTables, however we don't currently 
> support KTable/GlobalKTable joins as they require materializing a state store 
> for the join. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values

2017-05-11 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006486#comment-16006486
 ] 

Dmitry Minkovsky commented on KAFKA-4750:
-

Affects 0.10.2.1 also

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1
>Reporter: Michal Borowiecki
>Assignee: Kamal Chandraprakash
>  Labels: newbie
>
> The API for ReadOnlyKeyValueStore.range method promises the returned iterator 
> will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found null values are returned causing NPEs on our side.
> I found this happens after removing entries from the store and I found 
> resemblance to SAMZA-94 defect. The problem seems to be as it was there, when 
> deleting entries and having a serializer that does not return null when null 
> is passed in, the state store doesn't actually delete that key/value pair but 
> the iterator will return null value for that key.
> When I modified our serilizer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in kafka streams, 
> perhaps with a similar approach as SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5105) ReadOnlyKeyValueStore range scans are not ordered

2017-04-21 Thread Dmitry Minkovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Minkovsky updated KAFKA-5105:

Description: 
Following up with this thread 
https://www.mail-archive.com/users@kafka.apache.org/msg25373.html

Although ReadOnlyKeyValueStore's #range() is documented not to returns values 
in order, it would be great if it would for keys within a single partition. 
This would facilitate using interactive queries and local state as one would 
use HBase to index data by prefixed keys. If range returned keys in 
lexicographical order, I could use interactive queries for all my data needs 
except search.

  was:
Following up with this thread 
https://www.mail-archive.com/users@kafka.apache.org/msg25373.html

Although ReadOnlyKeyValueStore's #range() is documented not to returns values 
in order, it would be great if it would for keys within a single partition. 
This would facilitate using interactive queries and local state as one would 
use HBase to index data by prefixed keys. 


> ReadOnlyKeyValueStore range scans are not ordered
> -
>
> Key: KAFKA-5105
> URL: https://issues.apache.org/jira/browse/KAFKA-5105
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Dmitry Minkovsky
>
> Following up with this thread 
> https://www.mail-archive.com/users@kafka.apache.org/msg25373.html
> Although ReadOnlyKeyValueStore's #range() is documented not to returns values 
> in order, it would be great if it would for keys within a single partition. 
> This would facilitate using interactive queries and local state as one would 
> use HBase to index data by prefixed keys. If range returned keys in 
> lexicographical order, I could use interactive queries for all my data needs 
> except search.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5105) ReadOnlyKeyValueStore range scans are not ordered

2017-04-21 Thread Dmitry Minkovsky (JIRA)
Dmitry Minkovsky created KAFKA-5105:
---

 Summary: ReadOnlyKeyValueStore range scans are not ordered
 Key: KAFKA-5105
 URL: https://issues.apache.org/jira/browse/KAFKA-5105
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.2.0
Reporter: Dmitry Minkovsky


Following up with this thread 
https://www.mail-archive.com/users@kafka.apache.org/msg25373.html

Although ReadOnlyKeyValueStore's #range() is documented not to returns values 
in order, it would be great if it would for keys within a single partition. 
This would facilitate using interactive queries and local state as one would 
use HBase to index data by prefixed keys. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values

2017-03-17 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930379#comment-15930379
 ] 

Dmitry Minkovsky commented on KAFKA-4750:
-

Can confirm I experienced this on 0.10.1 as well. 

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1
>Reporter: Michal Borowiecki
>
> The API for ReadOnlyKeyValueStore.range method promises the returned iterator 
> will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found null values are returned causing NPEs on our side.
> I found this happens after removing entries from the store and I found 
> resemblance to SAMZA-94 defect. The problem seems to be as it was there, when 
> deleting entries and having a serializer that does not return null when null 
> is passed in, the state store doesn't actually delete that key/value pair but 
> the iterator will return null value for that key.
> When I modified our serilizer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in kafka streams, 
> perhaps with a similar approach as SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-3901) KStreamTransformValues$KStreamTransformValuesProcessor#process() forwards null values

2016-11-08 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649270#comment-15649270
 ] 

Dmitry Minkovsky edited comment on KAFKA-3901 at 11/9/16 12:08 AM:
---

Also affects 0.10.1.0. How about a PR?


was (Author: dminkovsky):
Also affects 0.10.1.0 <3

> KStreamTransformValues$KStreamTransformValuesProcessor#process() forwards 
> null values
> -
>
> Key: KAFKA-3901
> URL: https://issues.apache.org/jira/browse/KAFKA-3901
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Dmitry Minkovsky
>Assignee: Guozhang Wang
>Priority: Minor
>
> Did this get missed in [KAFKA-3519: Refactor Transformer's transform / 
> punctuate to return nullable 
> value|https://github.com/apache/kafka/commit/40fd456649b5df29d030da46865b5e7e0ca6db15#diff-338c230fd5a15d98550230007651a224]?
>  I think it may have, because that processor's #punctuate() does not forward 
> null. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3901) KStreamTransformValues$KStreamTransformValuesProcessor#process() forwards null values

2016-11-08 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649270#comment-15649270
 ] 

Dmitry Minkovsky commented on KAFKA-3901:
-

Also affects 0.10.1.0 <3

> KStreamTransformValues$KStreamTransformValuesProcessor#process() forwards 
> null values
> -
>
> Key: KAFKA-3901
> URL: https://issues.apache.org/jira/browse/KAFKA-3901
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Dmitry Minkovsky
>Assignee: Guozhang Wang
>Priority: Minor
>
> Did this get missed in [KAFKA-3519: Refactor Transformer's transform / 
> punctuate to return nullable 
> value|https://github.com/apache/kafka/commit/40fd456649b5df29d030da46865b5e7e0ca6db15#diff-338c230fd5a15d98550230007651a224]?
>  I think it may have, because that processor's #punctuate() does not forward 
> null. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3901) KStreamTransformValues$KStreamTransformValuesProcessor#process() forwards null values

2016-06-24 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15348624#comment-15348624
 ] 

Dmitry Minkovsky commented on KAFKA-3901:
-

...and KStreamTransform$KStreamTransformProcessor#process() does not forward 
null values.

Anyway, I marked this minor because I can use the non-value process, so there 
is a workaround.

> KStreamTransformValues$KStreamTransformValuesProcessor#process() forwards 
> null values
> -
>
> Key: KAFKA-3901
> URL: https://issues.apache.org/jira/browse/KAFKA-3901
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Dmitry Minkovsky
>Assignee: Guozhang Wang
>Priority: Minor
>
> Did this get missed in [KAFKA-3519: Refactor Transformer's transform / 
> punctuate to return nullable 
> value|https://github.com/apache/kafka/commit/40fd456649b5df29d030da46865b5e7e0ca6db15#diff-338c230fd5a15d98550230007651a224]?
>  I think it may have, because that processor's #punctuate() does not forward 
> null. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3901) KStreamTransformValues$KStreamTransformValuesProcessor#process() forwards null values

2016-06-24 Thread Dmitry Minkovsky (JIRA)
Dmitry Minkovsky created KAFKA-3901:
---

 Summary: 
KStreamTransformValues$KStreamTransformValuesProcessor#process() forwards null 
values
 Key: KAFKA-3901
 URL: https://issues.apache.org/jira/browse/KAFKA-3901
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.0.0
Reporter: Dmitry Minkovsky
Assignee: Guozhang Wang
Priority: Minor


Did this get missed in [KAFKA-3519: Refactor Transformer's transform / 
punctuate to return nullable 
value|https://github.com/apache/kafka/commit/40fd456649b5df29d030da46865b5e7e0ca6db15#diff-338c230fd5a15d98550230007651a224]?
 I think it may have, because that processor's #punctuate() does not forward 
null. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3862) Kafka streams documentation: partition<->thread<->task assignment unclear

2016-06-17 Thread Dmitry Minkovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336800#comment-15336800
 ] 

Dmitry Minkovsky commented on KAFKA-3862:
-

Oh hey, I somehow missed 0.10 and Confluent 3.0. Current version of docs are 
much clearer/don't have this issue. 

> Kafka streams documentation: partition<->thread<->task assignment unclear
> -
>
> Key: KAFKA-3862
> URL: https://issues.apache.org/jira/browse/KAFKA-3862
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
> Environment: N/A
>Reporter: Dmitry Minkovsky
>Assignee: Guozhang Wang
>Priority: Minor
> Fix For: 0.10.0.0
>
>
> Hi. Not sure if this is the right place to post about documentation (this 
> isn't actually a software bug).
> I believe the following span of documentation is self-contradictory:
> !http://i.imgur.com/WM8ada2.png!
> In the paragraph text before and after the image, it says the partitions are 
> *evenly distributed* among the threads. However, in the image, Thread 1 gets 
> p1 and p3 (4 partitions, total) while Thread 2 gets p2 (2 partitions, total). 
> That's "even" in the sense that you can't do any better, but still not even, 
> and in conflict with the text that follows the image:
> !http://i.imgur.com/xq3SOre.png!
> This doesn't make sense because why would p2 from topicA and topicB go to 
> different threads? if this were the cause, you couldn't join the partitions, 
> right?
> Either way, it seems like the two things are in conflict.
> Finally, the wording:
> bq. ...assume that each thread executes a single stream task...
> What does "assume" mean here? I've not seen any indication elsewhere in the 
> documentation that shows you can configure this behavior. Sorry if I missed 
> this.
> Thank you!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-3862) Kafka streams documentation: partition<->thread<->task assignment unclear

2016-06-17 Thread Dmitry Minkovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Minkovsky resolved KAFKA-3862.
-
   Resolution: Fixed
Fix Version/s: 0.10.0.0

> Kafka streams documentation: partition<->thread<->task assignment unclear
> -
>
> Key: KAFKA-3862
> URL: https://issues.apache.org/jira/browse/KAFKA-3862
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
> Environment: N/A
>Reporter: Dmitry Minkovsky
>Assignee: Guozhang Wang
>Priority: Minor
> Fix For: 0.10.0.0
>
>
> Hi. Not sure if this is the right place to post about documentation (this 
> isn't actually a software bug).
> I believe the following span of documentation is self-contradictory:
> !http://i.imgur.com/WM8ada2.png!
> In the paragraph text before and after the image, it says the partitions are 
> *evenly distributed* among the threads. However, in the image, Thread 1 gets 
> p1 and p3 (4 partitions, total) while Thread 2 gets p2 (2 partitions, total). 
> That's "even" in the sense that you can't do any better, but still not even, 
> and in conflict with the text that follows the image:
> !http://i.imgur.com/xq3SOre.png!
> This doesn't make sense because why would p2 from topicA and topicB go to 
> different threads? if this were the cause, you couldn't join the partitions, 
> right?
> Either way, it seems like the two things are in conflict.
> Finally, the wording:
> bq. ...assume that each thread executes a single stream task...
> What does "assume" mean here? I've not seen any indication elsewhere in the 
> documentation that shows you can configure this behavior. Sorry if I missed 
> this.
> Thank you!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3862) Kafka streams documentation: partition<->thread<->task assignment unclear

2016-06-17 Thread Dmitry Minkovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Minkovsky updated KAFKA-3862:

Description: 
Hi. Not sure if this is the right place to post about documentation (this isn't 
actually a software bug).

I believe the following span of documentation is self-contradictory:

!http://i.imgur.com/WM8ada2.png!

In the paragraph text before and after the image, it says the partitions are 
*evenly distributed* among the threads. However, in the image, Thread 1 gets p1 
and p3 (4 partitions, total) while Thread 2 gets p2 (2 partitions, total). 
That's "even" in the sense that you can't do any better, but still not even, 
and in conflict with the text that follows the image:

!http://i.imgur.com/xq3SOre.png!

This doesn't make sense because why would p2 from topicA and topicB go to 
different threads? if this were the cause, you couldn't join the partitions, 
right?

Either way, it seems like the two things are in conflict.

Finally, the wording:

bq. ...assume that each thread executes a single stream task...

What does "assume" mean here? I've not seen any indication elsewhere in the 
documentation that show you can configure this behavior. Sorry if I missed this.

Thank you!


  was:
Hi. Not sure if this is the right place to post about documentation (this isn't 
actually a software bug).

I believe the following span of documentation is self-contradictory:

!http://i.imgur.com/WM8ada2.png!

In the paragraph text before and after the image, it says the partitions are 
*evenly distributed* among the threads. However, in the image, Thread 1 gets p1 
and p3 (4 partitions, total) while Thread 2 gets p2 (2 partitions, total). 
That's "even" in the sense that you can't do any better, but still not even, 
and in conflict with the text that follows the image:

!http://i.imgur.com/xq3SOre.png!

This doesn't make sense because why would p2 from topicA and topicB go to 
different threads? if this were the cause, you couldn't join the partitions, 
right?

Either way, it seems like the two things are in conflict.

Finally, the wording:

bq. ...assume that each thread executes a single stream task...

What does "assume" mean here. I've not seen any indication elsewhere in the 
documentation that show you can configure this behavior. Sorry if I missed this.

Thank you!



> Kafka streams documentation: partition<->thread<->task assignment unclear
> -
>
> Key: KAFKA-3862
> URL: https://issues.apache.org/jira/browse/KAFKA-3862
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
> Environment: N/A
>Reporter: Dmitry Minkovsky
>Assignee: Guozhang Wang
>Priority: Minor
>
> Hi. Not sure if this is the right place to post about documentation (this 
> isn't actually a software bug).
> I believe the following span of documentation is self-contradictory:
> !http://i.imgur.com/WM8ada2.png!
> In the paragraph text before and after the image, it says the partitions are 
> *evenly distributed* among the threads. However, in the image, Thread 1 gets 
> p1 and p3 (4 partitions, total) while Thread 2 gets p2 (2 partitions, total). 
> That's "even" in the sense that you can't do any better, but still not even, 
> and in conflict with the text that follows the image:
> !http://i.imgur.com/xq3SOre.png!
> This doesn't make sense because why would p2 from topicA and topicB go to 
> different threads? if this were the cause, you couldn't join the partitions, 
> right?
> Either way, it seems like the two things are in conflict.
> Finally, the wording:
> bq. ...assume that each thread executes a single stream task...
> What does "assume" mean here? I've not seen any indication elsewhere in the 
> documentation that show you can configure this behavior. Sorry if I missed 
> this.
> Thank you!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3862) Kafka streams documentation: partition<->thread<->task assignment unclear

2016-06-17 Thread Dmitry Minkovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Minkovsky updated KAFKA-3862:

Description: 
Hi. Not sure if this is the right place to post about documentation (this isn't 
actually a software bug).

I believe the following span of documentation is self-contradictory:

!http://i.imgur.com/WM8ada2.png!

In the paragraph text before and after the image, it says the partitions are 
*evenly distributed* among the threads. However, in the image, Thread 1 gets p1 
and p3 (4 partitions, total) while Thread 2 gets p2 (2 partitions, total). 
That's "even" in the sense that you can't do any better, but still not even, 
and in conflict with the text that follows the image:

!http://i.imgur.com/xq3SOre.png!

This doesn't make sense because why would p2 from topicA and topicB go to 
different threads? if this were the cause, you couldn't join the partitions, 
right?

Either way, it seems like the two things are in conflict.

Finally, the wording:

bq. ...assume that each thread executes a single stream task...

What does "assume" mean here? I've not seen any indication elsewhere in the 
documentation that shows you can configure this behavior. Sorry if I missed 
this.

Thank you!


  was:
Hi. Not sure if this is the right place to post about documentation (this isn't 
actually a software bug).

I believe the following span of documentation is self-contradictory:

!http://i.imgur.com/WM8ada2.png!

In the paragraph text before and after the image, it says the partitions are 
*evenly distributed* among the threads. However, in the image, Thread 1 gets p1 
and p3 (4 partitions, total) while Thread 2 gets p2 (2 partitions, total). 
That's "even" in the sense that you can't do any better, but still not even, 
and in conflict with the text that follows the image:

!http://i.imgur.com/xq3SOre.png!

This doesn't make sense because why would p2 from topicA and topicB go to 
different threads? if this were the cause, you couldn't join the partitions, 
right?

Either way, it seems like the two things are in conflict.

Finally, the wording:

bq. ...assume that each thread executes a single stream task...

What does "assume" mean here? I've not seen any indication elsewhere in the 
documentation that show you can configure this behavior. Sorry if I missed this.

Thank you!



> Kafka streams documentation: partition<->thread<->task assignment unclear
> -
>
> Key: KAFKA-3862
> URL: https://issues.apache.org/jira/browse/KAFKA-3862
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
> Environment: N/A
>Reporter: Dmitry Minkovsky
>Assignee: Guozhang Wang
>Priority: Minor
>
> Hi. Not sure if this is the right place to post about documentation (this 
> isn't actually a software bug).
> I believe the following span of documentation is self-contradictory:
> !http://i.imgur.com/WM8ada2.png!
> In the paragraph text before and after the image, it says the partitions are 
> *evenly distributed* among the threads. However, in the image, Thread 1 gets 
> p1 and p3 (4 partitions, total) while Thread 2 gets p2 (2 partitions, total). 
> That's "even" in the sense that you can't do any better, but still not even, 
> and in conflict with the text that follows the image:
> !http://i.imgur.com/xq3SOre.png!
> This doesn't make sense because why would p2 from topicA and topicB go to 
> different threads? if this were the cause, you couldn't join the partitions, 
> right?
> Either way, it seems like the two things are in conflict.
> Finally, the wording:
> bq. ...assume that each thread executes a single stream task...
> What does "assume" mean here? I've not seen any indication elsewhere in the 
> documentation that shows you can configure this behavior. Sorry if I missed 
> this.
> Thank you!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3862) Kafka streams documentation: partition<->thread<->task assignment unclear

2016-06-17 Thread Dmitry Minkovsky (JIRA)
Dmitry Minkovsky created KAFKA-3862:
---

 Summary: Kafka streams documentation: partition<->thread<->task 
assignment unclear
 Key: KAFKA-3862
 URL: https://issues.apache.org/jira/browse/KAFKA-3862
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.0.0
 Environment: N/A
Reporter: Dmitry Minkovsky
Assignee: Guozhang Wang
Priority: Minor


Hi. Not sure if this is the right place to post about documentation (this isn't 
actually a software bug).

I believe the following span of documentation is self-contradictory:

!http://i.imgur.com/WM8ada2.png!

In the paragraph text before and after the image, it says the partitions are 
*evenly distributed* among the threads. However, in the image, Thread 1 gets p1 
and p3 (4 partitions, total) while Thread 2 gets p2 (2 partitions, total). 
That's "even" in the sense that you can't do any better, but still not even, 
and in conflict with the text that follows the image:

!http://i.imgur.com/xq3SOre.png!

This doesn't make sense because why would p2 from topicA and topicB go to 
different threads? if this were the cause, you couldn't join the partitions, 
right?

Either way, it seems like the two things are in conflict.

Finally, the wording:

bq. ...assume that each thread executes a single stream task...

What does "assume" mean here. I've not seen any indication elsewhere in the 
documentation that show you can configure this behavior. Sorry if I missed this.

Thank you!




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)