CassandraAdapter (Add Type) and WHERE statement.
Hi guys , I study Calcite the benefits that a Cassandra-Calcite Adapter can bring , as for example brings the possibility of join. the problem type defined into CassandraSchema.getRelDataType(..) is very limited some important type are missing boolean / array ect... I thought inherited from the class CassandraSchema for Override this method and add more type but this method is used inside CassandraTable too. i would like to avoid to re-write fully this adapter :) do you have suggestions? My second question is : Cassandra is not optimized to have WHERE on key not defined on cluster/partition key. I was wondering if calcite could play a role without this mechanism to improve performance Thank ! Yanna
Re: CassandraAdapter (Add Type) and WHERE statement.
Thank for reply Michael. yes i understood this on the documentation for example with "WHERE" statement calcite i force the . "ALLOW FILTERING; " and this can be expensive. I think there may be an interesting approach using STREAM. for example maintain a regular update between a cassandra TABLE and a STREAM TABLE. CASSANDRA_TABLE_A .(SELECT * FROM TABLE_A) > STREAM_TABLE_A . SELECT STREAM * FROM STREAM_TABLE_A WHERE username = 'JmuhsAaMdw' i guess it will be more efficient to directly make the WHERE from the STREAM than the cassandra_adapter using "allow filtering" a synchronization strategy can be set up between the cassandra table and the STREAM table what is your opinion about this approach ? Thanks ! Yana Le mer. 16 oct. 2019 à 17:08, Michael Mior a écrit : > You're right that there are several types which are not supported by > the Cassandra adapter. We would happily accept pull requests to add > support for new types. > > You're also correct that Cassandra cannot efficiently execute queries > which do not specify the partition key. Calcite will make those > queries more efficient, but it can make it easier to execute queries > that CQL does not directly support. Ultimately data is still stored > based on the partition key, so if your query does not specify a > partition key, Calcite will still need to issue an expensive > cross-partition query to Cassandra. > -- > Michael Mior > mm...@apache.org > > Le mer. 16 oct. 2019 à 07:57, Yanna elina a > écrit : > > > > Hi guys , > > > > I study Calcite the benefits that a Cassandra-Calcite Adapter can bring , > > as for example brings the possibility of join. > > > > the problem type defined into CassandraSchema.getRelDataType(..) is very > > limited > > some important type are missing boolean / array ect... > > > > I thought inherited from the class CassandraSchema for Override this > > method and add more type but this method is used inside CassandraTable > too. > > > > i would like to avoid to re-write fully this adapter :) > > > > do you have suggestions? > > > > My second question is : Cassandra is not optimized to have WHERE on key > > not defined on cluster/partition key. I was wondering if calcite could > play > > a role without this mechanism to improve performance > > > > > > Thank ! > > > > Yanna >
Re: CassandraAdapter (Add Type) and WHERE statement.
I think it's me who does not have to understand all the subtlety. I thought that STREAM works more like a in-memory- relational database but i missed something thank for your help :) Le jeu. 17 oct. 2019 à 15:53, Michael Mior a écrit : > Perhaps I'm missing something, but I don't see why this would be any > more efficient. Selecting all data is also not an efficient operation > in Cassandra. Using ALLOW FILTERING will likely be more efficient > since it's basically the same as doing a table scan, but it avoids > returning data which would later be filtered by Calcite anyway. > -- > Michael Mior > mm...@apache.org > > Le jeu. 17 oct. 2019 à 09:13, Yanna elina a > écrit : > > > > Thank for reply Michael. > > > > yes i understood this on the documentation for example with "WHERE" > > statement calcite i force the . "ALLOW FILTERING; " > > and this can be expensive. > > > > I think there may be an interesting approach using STREAM. > > > > for example maintain a regular update between a cassandra TABLE and a > > STREAM TABLE. > > > > CASSANDRA_TABLE_A .(SELECT * FROM TABLE_A) > STREAM_TABLE_A . > > SELECT STREAM * FROM STREAM_TABLE_A WHERE username = 'JmuhsAaMdw' > > > > i guess it will be more efficient to directly make the WHERE from the > > STREAM than the cassandra_adapter using "allow filtering" > > a synchronization strategy can be set up between the cassandra table and > > the STREAM table > > what is your opinion about this approach ? > > Thanks ! > > Yana > > > > > > Le mer. 16 oct. 2019 à 17:08, Michael Mior a écrit : > > > > > You're right that there are several types which are not supported by > > > the Cassandra adapter. We would happily accept pull requests to add > > > support for new types. > > > > > > You're also correct that Cassandra cannot efficiently execute queries > > > which do not specify the partition key. Calcite will make those > > > queries more efficient, but it can make it easier to execute queries > > > that CQL does not directly support. Ultimately data is still stored > > > based on the partition key, so if your query does not specify a > > > partition key, Calcite will still need to issue an expensive > > > cross-partition query to Cassandra. > > > -- > > > Michael Mior > > > mm...@apache.org > > > > > > Le mer. 16 oct. 2019 à 07:57, Yanna elina > a > > > écrit : > > > > > > > > Hi guys , > > > > > > > > I study Calcite the benefits that a Cassandra-Calcite Adapter can > bring , > > > > as for example brings the possibility of join. > > > > > > > > the problem type defined into CassandraSchema.getRelDataType(..) is > very > > > > limited > > > > some important type are missing boolean / array ect... > > > > > > > > I thought inherited from the class CassandraSchema for Override this > > > > method and add more type but this method is used inside > CassandraTable > > > too. > > > > > > > > i would like to avoid to re-write fully this adapter :) > > > > > > > > do you have suggestions? > > > > > > > > My second question is : Cassandra is not optimized to have WHERE on > key > > > > not defined on cluster/partition key. I was wondering if calcite > could > > > play > > > > a role without this mechanism to improve performance > > > > > > > > > > > > Thank ! > > > > > > > > Yanna > > > >
CsvAdapter (Json content) from String / InputStream
Hi guys , I saw in the code that this nice adapter makes it possible to make SQL queries on the data JSON https://github.com/apache/calcite/tree/ab71c4cae5a5c3c7d979337a2d38ddaf271aa206/example/csv/src/main/java/org/apache/calcite/adapter/csv But it's limited on File / URL. JsonTable constructor accepte only a Source object and this Source object can be construct only accross a File / URL. it could be nice to have the possibility to make this source from ImputStream too . Creating a temp-file from an InputStream or String can be excesive. Thanks
Re: CsvAdapter (Json content) from String / InputStream
Big Thank Guys !!! Even if this CSVAdapter is an "example" its still really usefull :) and with this update it will be a good "out of box" tools usable in many scenario/workflow Yanna Le jeu. 14 nov. 2019 à 21:20, Andrei Sereda a écrit : > > I see that this feature request relates to Source.java and Sources.java, > which are in org.apache.calcite.util in core. > I'm not planning to change Source.java it already exposes Reader / > InputStream > > > If you add some capability, it is fine to add It to the CSV adapter > example, but it is much more important that that capability exists in the > file adapter. > I will add to both. The general idea behind this change is that currently > CSV / File Adapter(s) require inputs to be File(s) or URL(s) which forces > users to create temporal resources manually (when their content is already > in-memory). If input to CSV / File adapter(s) is generic Readable [1] / > Reader [2] or InputStream it gives more flexibility to users. > > [1] https://docs.oracle.com/javase/7/docs/api/java/lang/Readable.html > [2] https://docs.oracle.com/javase/7/docs/api/java/io/Reader.html > > > On Thu, Nov 14, 2019 at 2:45 PM Julian Hyde wrote: > > > I see that this feature request relates to Source.java and Sources.java, > > which are in org.apache.calcite.util in core. > > > > If you add some capability, it is fine to add It to the CSV adapter > > example, but it is much more important that that capability exists in the > > file adapter. > > > > Julian > > > > > > > On Nov 14, 2019, at 11:36 AM, Andrei Sereda wrote: > > > > > > I think the change is straightforward (will not add complexity). > > > > > > On Thu, Nov 14, 2019 at 1:24 PM Julian Hyde wrote: > > > > > >> Remember that CsvAdapter is in the “example” module. Keep it simple. > > >> > > >> The file adapter can also parse CSV files. > > >> > > >> Julian > > >> > > >> > > >> > > >>> On Nov 14, 2019, at 9:40 AM, Andrei Sereda wrote: > > >>> > > >>> Hello, > > >>> > > >>> Source object already exposes Reader / InputStream API. Probably > > >>> JsonEnumerator can be changed to use those methods. > > >>> > > >>> Do you mind creating a JIRA ticket ? I'll take a look. > > >>> > > >>> Thanks, > > >>> Andrei. > > >>> > > >>> On Thu, Nov 14, 2019 at 7:45 AM Yanna elina < > > yannaelinasul...@gmail.com> > > >>> wrote: > > >>> > > >>>> Hi guys , > > >>>> I saw in the code that this nice adapter makes it possible to make > SQL > > >>>> queries on the data JSON > > >>>> > > >>>> > > >> > > > https://github.com/apache/calcite/tree/ab71c4cae5a5c3c7d979337a2d38ddaf271aa206/example/csv/src/main/java/org/apache/calcite/adapter/csv > > >>>> > > >>>> But it's limited on File / URL. > > >>>> JsonTable constructor accepte only a Source object and this Source > > >> object > > >>>> can be construct only accross a File / URL. > > >>>> > > >>>> it could be nice to have the possibility to make this source from > > >>>> ImputStream too . > > >>>> > > >>>> Creating a temp-file from an InputStream or String can be excesive. > > >>>> > > >>>> Thanks > > >>>> > > >> > > >> > > > > >
Re: CsvAdapter (Json content) from String / InputStream
Hi guys , i have another question about this adapter. I have a json sample like this : [ { "field": "test", "properties": [ "a", "b", "c" ] } ] JsonTable look like to convert Array to OTHER ColumType . the column "properties" is converted on ' when i check JsonEnumerator i can see this function code :*RelDataType type = typeFactory.createJavaType(jsonFieldMap.get(key).getClass());* *About WHERE / INis it supposed to work? on JavaType(ArrayList) ? * if i try to make this query "SELECT * FROM TABLE_A WHERE properties=any('a') i will have alway an Exception : org.apache.calcite.sql.validate.SqlValidatorException: Values passed to IN operator must have compatible type *if its not supposed to work i guess i need to re-implement this one to convert *JAVA.UTIL.ARRAYLIST on SQL_ARRAY type right ? thank Le lun. 18 nov. 2019 à 19:15, Yanna elina a écrit : > Big Thank Guys !!! > > Even if this CSVAdapter is an "example" its still really usefull :) and > with this update it will be a good "out of box" tools usable in many > scenario/workflow > > > Yanna > > Le jeu. 14 nov. 2019 à 21:20, Andrei Sereda a écrit : > >> > I see that this feature request relates to Source.java and Sources.java, >> which are in org.apache.calcite.util in core. >> I'm not planning to change Source.java it already exposes Reader / >> InputStream >> >> > If you add some capability, it is fine to add It to the CSV adapter >> example, but it is much more important that that capability exists in the >> file adapter. >> I will add to both. The general idea behind this change is that currently >> CSV / File Adapter(s) require inputs to be File(s) or URL(s) which forces >> users to create temporal resources manually (when their content is already >> in-memory). If input to CSV / File adapter(s) is generic Readable [1] / >> Reader [2] or InputStream it gives more flexibility to users. >> >> [1] https://docs.oracle.com/javase/7/docs/api/java/lang/Readable.html >> [2] https://docs.oracle.com/javase/7/docs/api/java/io/Reader.html >> >> >> On Thu, Nov 14, 2019 at 2:45 PM Julian Hyde wrote: >> >> > I see that this feature request relates to Source.java and Sources.java, >> > which are in org.apache.calcite.util in core. >> > >> > If you add some capability, it is fine to add It to the CSV adapter >> > example, but it is much more important that that capability exists in >> the >> > file adapter. >> > >> > Julian >> > >> > >> > > On Nov 14, 2019, at 11:36 AM, Andrei Sereda wrote: >> > > >> > > I think the change is straightforward (will not add complexity). >> > > >> > > On Thu, Nov 14, 2019 at 1:24 PM Julian Hyde wrote: >> > > >> > >> Remember that CsvAdapter is in the “example” module. Keep it simple. >> > >> >> > >> The file adapter can also parse CSV files. >> > >> >> > >> Julian >> > >> >> > >> >> > >> >> > >>> On Nov 14, 2019, at 9:40 AM, Andrei Sereda >> wrote: >> > >>> >> > >>> Hello, >> > >>> >> > >>> Source object already exposes Reader / InputStream API. Probably >> > >>> JsonEnumerator can be changed to use those methods. >> > >>> >> > >>> Do you mind creating a JIRA ticket ? I'll take a look. >> > >>> >> > >>> Thanks, >> > >>> Andrei. >> > >>> >> > >>> On Thu, Nov 14, 2019 at 7:45 AM Yanna elina < >> > yannaelinasul...@gmail.com> >> > >>> wrote: >> > >>> >> > >>>> Hi guys , >> > >>>> I saw in the code that this nice adapter makes it possible to make >> SQL >> > >>>> queries on the data JSON >> > >>>> >> > >>>> >> > >> >> > >> https://github.com/apache/calcite/tree/ab71c4cae5a5c3c7d979337a2d38ddaf271aa206/example/csv/src/main/java/org/apache/calcite/adapter/csv >> > >>>> >> > >>>> But it's limited on File / URL. >> > >>>> JsonTable constructor accepte only a Source object and this Source >> > >> object >> > >>>> can be construct only accross a File / URL. >> > >>>> >> > >>>> it could be nice to have the possibility to make this source from >> > >>>> ImputStream too . >> > >>>> >> > >>>> Creating a temp-file from an InputStream or String can be excesive. >> > >>>> >> > >>>> Thanks >> > >>>> >> > >> >> > >> >> > >> > >> >
Re: CSVAdapter OTHER TYPE
*Hi Danny , * *From my side with your suggestion i have this error message : for MAP type : * *SELECT properties['fullyQualifiedDomainName'] FROM TABLE_TEST : * Cannot apply 'ITEM' to arguments of type 'ITEM(, )'. Supported form(s): [] [] at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) *and for LIST TYPE : * *SELECT MY_ARRAY[1] FROM TABLE_TEST :* Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply 'ITEM' to arguments of type 'ITEM(, )'. Supported form(s): [] Regard , Yanna Le mar. 26 nov. 2019 à 12:19, Steve Robert a écrit : > Ok i got it , > > thank a lot Danny > > Le mar. 26 nov. 2019 à 01:53, Danny Chan a écrit : > > > Thanks, Steve ~ > > > > Calcite did support LIST type syntax, such as “INT ARRAY” or “INT > MULTISET > > ARRAY”, but we do not support MAP syntax yet, because it is not a SQL > > standard, so the MAP type would always parsed to other type. > > > > You can reference a ARRAY column item with syntax “column_name[_index]”, > > and MAP column value with syntax “column_name[‘key_name’]”, these syntax > > also works with “OTHER”/“ANY” data type. > > > > Best, > > Danny Chan > > 在 2019年11月25日 +0800 PM8:22,Steve Robert > >,写道: > > > Hi , > > > > > > I saw an interesting exchange about the CSVAdapter > > > > > > I am also in a similar case where I want to be able to perform > predicate > > > filters from table > > > after conversion from JSON to JsonScannableTable . > > > > > > > > > the first dimension are correctly converted into the right type VARCHAR > > ect > > > but other dimension like LIST or MAP are converted to "OTHER"-type > > > it is therefore impossible to execute CONTAINS / IN OR ANY on column > > > transformed to OTHER-TYPE > > > > > > > > > I would like to know where it is the code best place to convert LIST > MAP > > > to a SQL type ARRAY_LIST and no OTHER > > > > > > I am new in calcite I have a little trouble understanding the type > > > definition workflow. I looked at the side of JSONEnumerator but it > seems > > > opaque. > > > > > > thank you in advance > > > > > > Steve > > >