[ https://issues.apache.org/jira/browse/KAFKA-10383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marco Lotz updated KAFKA-10383:
-------------------------------
    Description: 
*Status Quo:*
 The current implementation of 
[KIP-213|https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable] 
(foreign-key join between two KTables) is _opinionated_ in terms of the storage 
layer.

Independently of the materialization method passed as the method argument, the 
join always creates an intermediate RocksDB state store. Thus, even when the 
provided materialization method is "in memory", RocksDB is still used under 
the hood for this internal state store.
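
For illustration, a minimal sketch of a topology that runs into this. The topic names, the store name and the class name are hypothetical, and the order value is assumed to be just the customer id so that the foreign-key extractor stays trivial. The result store is explicitly in-memory with logging disabled, yet (as described above) the join still sets up its internal store on RocksDB:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.Stores;

public class ForeignKeyJoinExample {

    // Builds a topology with a foreign-key join whose result store is in-memory.
    public static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical input topics; each order value is assumed to be a customer id.
        KTable<String, String> orders =
            builder.table("orders", Consumed.with(Serdes.String(), Serdes.String()));
        KTable<String, String> customers =
            builder.table("customers", Consumed.with(Serdes.String(), Serdes.String()));

        // In-memory result store, no changelog topic.
        Materialized<String, String, KeyValueStore<Bytes, byte[]>> inMemory =
            Materialized.<String, String>as(Stores.inMemoryKeyValueStore("orders-with-customer"))
                .withKeySerde(Serdes.String())
                .withValueSerde(Serdes.String())
                .withLoggingDisabled();

        // Foreign-key join: the extractor maps an order value to the customer key.
        orders.join(
            customers,
            orderValue -> orderValue,
            (order, customer) -> order + "|" + customer,
            inMemory);

        return builder.build();
    }
}
```

Only the final result store is controlled by the `Materialized` argument here; the internal store of the join is not exposed through this API.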

 

*Related problems:*
 * Integration tests: Having an implicit materialization method for the 
intermediate state store affects tests that use foreign-key joins. 
[On Windows-based systems|https://stackoverflow.com/questions/50602512/failed-to-delete-the-state-directory-in-ide-for-kafka-stream-application], 
which are affected by the RocksDB filesystem removal problem, one way to avoid 
the bug is to use in-memory state stores (rather than swallowing exceptions). 
Because the intermediate RocksDB store is created regardless of the 
materialization method, every integration test is forced to fall back on the 
hack of deleting the state directory manually and swallowing exceptions.
 * Short-lived streams: KTables can be short-lived, such that neither 
persistent storage nor changelog creation is desired. The current 
implementation prevents this.
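
To make the first point concrete, the manual deletion hack usually looks roughly like the sketch below. The class and method names are hypothetical and only the JDK is used; it assumes the test knows the path of the state directory it configured:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StateDirCleanup {

    // Recursively deletes stateDir. Returns false instead of throwing when
    // something cannot be removed (e.g. a file still locked on Windows).
    public static boolean deleteQuietly(Path stateDir) {
        if (!Files.exists(stateDir)) {
            return true; // nothing to clean up
        }
        try (Stream<Path> paths = Files.walk(stateDir)) {
            // Reverse order so that children are deleted before their parents.
            List<Path> ordered = paths
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
            for (Path p : ordered) {
                Files.delete(p);
            }
            return true;
        } catch (IOException | UncheckedIOException e) {
            // Swallowed on purpose: this is the hack the test is forced into.
            return false;
        }
    }
}
```

A test would call `StateDirCleanup.deleteQuietly(...)` in its teardown and ignore the result, which hides real cleanup failures. An in-memory intermediate store would make this step unnecessary.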

*Suggestion:*

One possible solution is to use a materialization method similar to the one 
provided in the argument when creating the intermediate foreign-key state 
store. If the materialization is in-memory and without a changelog, the same 
applies to the intermediate state store. 

  was:
*Status Quo:*
 The current implementation of [KIP-213 
|[https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable]]
 of Foreign Key Join between two KTables is _opinionated_ in terms of storage 
layer.

Independently of the Materialization method provided in the method argument, it 
generates an intermediary RocksDB state store. Thus, even when the 
Materialization method provided is "in memory", it will use RocksDB 
under-the-hood for this internal state-store.

 

*Related problems:*
 * IT Tests: Having an implicit materialization method for state-store affects 
tests using foreign key state-stores. 
[On Windows-based systems|https://stackoverflow.com/questions/50602512/failed-to-delete-the-state-directory-in-ide-for-kafka-stream-application], 
that have the RocksDB filesystem removal problem, a solution to avoid the bug 
is to use in-memory state-stores (rather than exception swallowing). Having the 
RocksDB storage forcibly created means that any IT test must necessarily use 
the manual FS deletion with exception swallowing hack.
 * Short lived Streams: Sometimes, KTables are short lived in a way that 
neither persistent storage nor changelogs are desired. The current 
implementation prevents this.

*Suggestion:*

One possible solution is to use the same materialization method that is 
provided in the argument when creating the intermediary Foreign Key 
state-store. If the Materialization is in memory and without changelog, the 
same happens in the state-store. 


> KTable Join on Foreign key is opinionated 
> ------------------------------------------
>
>                 Key: KAFKA-10383
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10383
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 2.4.1
>            Reporter: Marco Lotz
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
