> On Nov 15, 2019, at 1:39 PM, Paul Rogers <par0...@yahoo.com.INVALID> wrote:
> 
> Hi Charles,
> 
> A thought on debugging deserialization is to not do it in a query: capture 
> the JSON returned from a REST call, then write a simple unit test that 
> deserializes it by itself from a string or file. Deserialization is a bit 
> of a black art, and it is really a problem separate from Drill itself.

So dumb non-dev question... How exactly do I do that?  I have SerDe unit 
test(s), but the query in question is failing in the first part of the unit 
test.

@Test
public void testSerDe() throws Exception {
  String sql = "SELECT COUNT(*) FROM http.`/json?lat=36.7201600&lng=-4.4203400&date=2019-10-02`";
  String plan = queryBuilder().sql(sql).explainJson();
  long cnt = queryBuilder().physical(plan).singletonLong();
  assertEquals("Counts should match", 1L, cnt);
}
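
Would a standalone Jackson round trip be closer to what you mean? Something 
like the sketch below, run on JSON captured once and saved as a test resource. 
(This is only a sketch: the resource path is made up, and a bare ObjectMapper 
won't have the injectables that Drill's PhysicalPlanReader normally wires in, 
but it should reproduce the Jackson failure in isolation.)

import java.nio.file.Files;
import java.nio.file.Paths;
import com.fasterxml.jackson.databind.ObjectMapper;
import static org.junit.Assert.assertNotNull;

@Test
public void testGroupScanDeserialization() throws Exception {
  // The "http-scan" node copied out of the EXPLAIN plan JSON, saved as a resource
  String json = new String(Files.readAllBytes(
      Paths.get("src/test/resources/http-scan.json")));

  // Plain Jackson round trip; no Drill query machinery involved
  ObjectMapper mapper = new ObjectMapper();
  HttpGroupScan scan = mapper.readValue(json, HttpGroupScan.class);
  assertNotNull("Group scan should deserialize", scan);
}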


> 
> As it turns out, for my "day job" I'm doing a POC using Drill to query 
> SumoLogic. I took this as an opportunity to fill that gap you mentioned in 
> our book: how to create a storage plugin. See [1]. This is a work in 
> progress, but it has helped me build the planner-side stuff up to the batch 
> reader, after which the work is identical to that for a format plugin.

YES!!  Awesome!   I know it is super involved, but simply documenting it will 
help a lot. Add that to the number of beers I owe you!


> 
> The Sumo API is REST-based, but for now I'm using the clunky REST client 
> available in the Sumo public repo because of some unfortunate details of the 
> Sumo REST service when used for this purpose. (Sumo returns data as a set of 
> key/value pairs, not as a fixed JSON schema. [4])
> 
> Poking around elsewhere, it turns out someone wrote a very simple Presto 
> connector for REST [2] using the Retrofit library from Square [3], which 
> seems easy to use. If we create a generic REST plugin, we might want to look 
> at how it was done in Presto. Presto requires an up-front schema, which 
> Retrofit can provide. Drill, of course, does not require such a schema and so 
> works with ad-hoc schemas, such as the one that Sumo's API provides.
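> 
> To give a flavor of Retrofit: the service is declared as an annotated Java 
> interface, and the declared response type is what supplies the up-front 
> schema. (A hypothetical interface for the sunrise/sunset service used in 
> this thread; SunriseService and SunriseResponse are made-up names.)
> 
> // uses retrofit2.Call, retrofit2.http.GET, retrofit2.http.Query
> public interface SunriseService {
>   @GET("/json")
>   Call<SunriseResponse> lookup(@Query("lat") double lat,
>                                @Query("lng") double lng,
>                                @Query("date") String date);
> }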
> 
> Actually, better than using a deserializer would be to use Drill's existing 
> JSON parser to read data directly into value vectors. But that existing code 
> has lots of tech debt. I've been working on a PR for a new version based on 
> EVF, but that is a while off and won't help us today.
> 
> It is interesting to note that neither the JSON reader nor a generic REST 
> API would work with the Sumo API because of its structure. I think the JSON 
> reader would read an entire batch of Sumo results as a single record composed 
> of a repeated Map, with the elements being the key/value pairs. Not at all ideal.
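> 
> To make that concrete, a Sumo search result looks roughly like this (shape 
> recalled from the Sumo docs [4]; the field names are only illustrative):
> 
>   { "messages": [
>       { "map": { "_messagetime": "...", "_raw": "..." } },
>       { "map": { "_messagetime": "...", "_raw": "..." } }
>   ] }
> 
> There is no fixed set of top-level fields for Jackson to bind to; each row 
> is a bag of key/value pairs inside "map".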
> 
> So both the JSON reader and the REST API should eventually handle data 
> formats that are generic (name/value pairs) rather than expressed in the 
> structure of JSON objects (as required by Jackson and Retrofit). That is a 
> topic for later, but it is why the Sumo plugin has to be custom to Sumo's API 
> for now.
> 
> 
> Thanks,
> - Paul
> 
> 
> [1] https://github.com/paul-rogers/drill/wiki/Create-a-Storage-Plugin
> 
> [2] https://github.com/prestosql-rocks/presto-rest
> 
> [3] https://square.github.io/retrofit/
> 
> [4] https://help.sumologic.com/APIs/Search-Job-API/About-the-Search-Job-API
> 
> 
> 
> 
> 
> On Friday, November 15, 2019, 09:04:21 AM PST, Charles Givre <cgi...@gmail.com> wrote:
> 
> Hi Igor, 
> Thanks for the advice. I've been doing some digging and am still pretty 
> stuck. Can you recommend any techniques for debugging the Jackson 
> serialization/deserialization? I added a unit test that serializes a query 
> and then deserializes it, and that test fails. I've tracked this back to a 
> constructor not receiving the plugin config and then throwing an NPE. What I 
> can't seem to figure out is where that constructor is being called from, and why.
> 
> Any advice would be greatly appreciated. Code can be found here: 
> https://github.com/apache/drill/pull/1892
> Thanks,
> -- C
> 
> 
>> On Oct 12, 2019, at 3:27 AM, Igor Guzenko <ihor.huzenko....@gmail.com> wrote:
>> 
>> Hello Charles,
>> 
>> Looks like you found another new issue. Maybe my explanation was unclear,
>> but my previous suggestion wasn't about the EXPLAIN PLAN construct, but rather:
>> 1) Use an HTTP client like Postman, or simply a browser, to save the response
>> of the REST service into a JSON file.
>> 2) Debug Drill reading that file, in order to compare how Calcite's conversion
>> from the AST SqlNode to the RelNode tree differs between the existing dfs
>> storage plugin and the same flow in your storage plugin (see the example
>> query below).
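>> 
>> For example, if you saved the response as /tmp/response.json (path is just
>> an example), run the same projection through the dfs plugin and put a
>> breakpoint in SqlToRelConverter.convertIdentifier to compare the two flows:
>> 
>>   SELECT sunset FROM dfs.`/tmp/response.json`;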
>> 
>> From your last email I can tell there is another issue, with the class
>> HttpGroupScan: at some point Drill tried to deserialize JSON into an
>> instance of HttpGroupScan and the Jackson library couldn't figure out how
>> to do it. You probably missed a constructor with Jackson metadata; for
>> example, see the HiveScan operator:
>> 
>> @JsonCreator
>> public HiveScan(@JsonProperty("userName") final String userName,
>>                 @JsonProperty("hiveReadEntry") final HiveReadEntry hiveReadEntry,
>>                 @JsonProperty("hiveStoragePluginConfig") final HiveStoragePluginConfig hiveStoragePluginConfig,
>>                 @JsonProperty("columns") final List<SchemaPath> columns,
>>                 @JsonProperty("confProperties") final Map<String, String> confProperties,
>>                 @JacksonInject final StoragePluginRegistry pluginRegistry) throws ExecutionSetupException {
>>   this(userName,
>>       hiveReadEntry,
>>       (HiveStoragePlugin) pluginRegistry.getPlugin(hiveStoragePluginConfig),
>>       columns,
>>       null, confProperties);
>> }
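>> 
>> For your plugin it could look something like the sketch below. The property
>> names ("scanSpec", "columns", "storageConfig") come from the plan JSON in
>> your error message; the types HttpScanSpec, HttpStoragePluginConfig and
>> HttpStoragePlugin are my guesses at your class names, so adjust them to
>> whatever your code actually uses:
>> 
>> @JsonCreator
>> public HttpGroupScan(@JsonProperty("scanSpec") HttpScanSpec scanSpec,
>>                      @JsonProperty("columns") List<SchemaPath> columns,
>>                      @JsonProperty("storageConfig") HttpStoragePluginConfig config,
>>                      @JacksonInject StoragePluginRegistry pluginRegistry)
>>     throws ExecutionSetupException {
>>   // Recover the live plugin instance from its serialized config,
>>   // the same way HiveScan does above.
>>   this(scanSpec, columns, (HttpStoragePlugin) pluginRegistry.getPlugin(config));
>> }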
>> 
>> Kind regards,
>> Igor
>> 
>> 
>> 
>> On Fri, Oct 11, 2019 at 10:53 PM Charles Givre <cgi...@gmail.com> wrote:
>> 
>>> Hi Igor,
>>> Thanks for responding. I'm not sure if this is what you intended, but I
>>> looked at the JSON for the query plans and found something interesting.
>>> For the SELECT * query, I get the following when I try to run the
>>> physical plan that it generates (without modification). Do you think this
>>> could be a related problem?
>>> 
>>> 
>>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
>>> InvalidDefinitionException: Cannot construct instance of
>>> `org.apache.drill.exec.store.http.HttpGroupScan` (no Creators, like default
>>> construct, exist): cannot deserialize from Object value (no delegate- or
>>> property-based Creator)
>>> at [Source: (String)"{
>>>   "head" : {
>>>     "version" : 1,
>>>     "generator" : {
>>>       "type" : "ExplainHandler",
>>>       "info" : ""
>>>     },
>>>     "type" : "APACHE_DRILL_PHYSICAL",
>>>     "options" : [ ],
>>>     "queue" : 0,
>>>     "hasResourcePlan" : false,
>>>     "resultMode" : "EXEC"
>>>   },
>>>   "graph" : [ {
>>>     "pop" : "http-scan",
>>>     "@id" : 2,
>>>     "scanSpec" : {
>>>       "uri" : "/json?lat=36.7201600&lng=-4.4203400&date=2019-10-02"
>>>     },
>>>     "columns" : [ "`**`" ],
>>>     "storageConfig" : {
>>>       "type" : "http",
>>>   "[truncated 766 chars]; line: 16, column: 5] (through reference chain:
>>> org.apache.drill.exec.physical.PhysicalPlan["graph"]->java.util.ArrayList[0])
>>> 
>>> 
>>> Please, refer to logs for more information.
>>> 
>>> [Error Id: 751b6d05-a631-4eca-9d83-162ab4fa839f on localhost:31010]
>>> 
>>> 
>>>> On Oct 11, 2019, at 12:25 PM, Igor Guzenko <ihor.huzenko....@gmail.com> wrote:
>>>> 
>>>> Hello Charles,
>>>> 
>>>> You got the error from Apache Calcite at the planning stage, while
>>>> converting a SqlIdentifier to a RexNode. From your stack trace, the
>>>> conversion starts at DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:685)
>>>> and goes to SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3694).
>>>> I would suggest saving the JSON returned by REST locally as a file and
>>>> debugging the same trace for a query on that JSON file. Then you can find
>>>> the difference between the conversion of the SQL identifier to rel for
>>>> standard JSON reading and for your storage plugin.
>>>> 
>>>> Thanks, Igor
>>>> 
>>>> 
>>>> On Fri, Oct 11, 2019 at 6:34 PM Charles Givre <cgi...@gmail.com> wrote:
>>>> 
>>>>> Hello all,
>>>>> I decided to take the leap and attempt to implement a storage plugin. I
>>>>> found that a few people had started this, so I thought I'd complete a
>>>>> simple generic HTTP/REST storage plugin. The use case would be to enrich
>>>>> data sets with data that's available via public or internal APIs.
>>>>> 
>>>>> Anyway, I'm a little stuck and need some assistance. I got the plugin to
>>>>> successfully execute a star query and return the results correctly:
>>>>> 
>>>>> apache drill> SELECT * FROM http.`/json?lat=36.7201600&lng=-4.4203400&date=2019-10-02`;
>>>>> +------------+------------+-------------+------------+----------------------+--------------------+-------------------------+-----------------------+-----------------------------+---------------------------+
>>>>> | sunrise    | sunset     | solar_noon  | day_length | civil_twilight_begin | civil_twilight_end | nautical_twilight_begin | nautical_twilight_end | astronomical_twilight_begin | astronomical_twilight_end |
>>>>> +------------+------------+-------------+------------+----------------------+--------------------+-------------------------+-----------------------+-----------------------------+---------------------------+
>>>>> | 6:13:58 AM | 5:59:55 PM | 12:06:56 PM | 11:45:57   | 5:48:14 AM           | 6:25:38 PM         | 5:18:16 AM              | 6:55:36 PM            | 4:48:07 AM                  | 7:25:45 PM                |
>>>>> +------------+------------+-------------+------------+----------------------+--------------------+-------------------------+-----------------------+-----------------------------+---------------------------+
>>>>> 1 row selected (0.392 seconds)
>>>>> 
>>>>> However, when I attempt to select individual fields, I get errors (see
>>>>> below for the full stack trace). I've walked through this with the
>>>>> debugger, but it seems like the code is breaking before it hits my
>>>>> storage plugin, and I'm not sure what to do about it. Here's a link to
>>>>> the code:
>>>>> https://github.com/cgivre/drill/tree/storage-http/contrib/storage-http
>>>>> 
>>>>> Any assistance would be greatly appreciated.  Thanks!!
>>>>> 
>>>>> 
>>>>> 
>>>>> apache drill> !verbose
>>>>> verbose: on
>>>>> apache drill> SELECT sunset FROM http.`/json?lat=36.7201600&lng=-4.4203400&date=2019-10-02`;
>>>>> Error: SYSTEM ERROR: AssertionError: Field ordinal 1 is invalid for  type '(DrillRecordRow[**])'
>>>>> 
>>>>> Please, refer to logs for more information.
>>>>> 
>>>>> [Error Id: d7bccd2f-73e6-40d7-9b8a-73a772f65c02 on 192.168.1.21:31010] (state=,code=0)
>>>>> java.sql.SQLException: SYSTEM ERROR: AssertionError: Field ordinal 1 is invalid for  type '(DrillRecordRow[**])'
>>>>> 
>>>>> Please, refer to logs for more information.
>>>>> 
>>>>> [Error Id: d7bccd2f-73e6-40d7-9b8a-73a772f65c02 on 192.168.1.21:31010]
>>>>>   at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:538)
>>>>>   at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:610)
>>>>>   at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1278)
>>>>>   at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:58)
>>>>>   at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
>>>>>   at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1102)
>>>>>   at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1113)
>>>>>   at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>>>>>   at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:200)
>>>>>   at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>>>>>   at org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)
>>>>>   at sqlline.Commands.executeSingleQuery(Commands.java:1008)
>>>>>   at sqlline.Commands.execute(Commands.java:957)
>>>>>   at sqlline.Commands.sql(Commands.java:921)
>>>>>   at sqlline.SqlLine.dispatch(SqlLine.java:717)
>>>>>   at sqlline.SqlLine.begin(SqlLine.java:536)
>>>>>   at sqlline.SqlLine.start(SqlLine.java:266)
>>>>>   at sqlline.SqlLine.main(SqlLine.java:205)
>>>>> Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: AssertionError: Field ordinal 1 is invalid for  type '(DrillRecordRow[**])'
>>>>> 
>>>>> Please, refer to logs for more information.
>>>>> 
>>>>> [Error Id: d7bccd2f-73e6-40d7-9b8a-73a772f65c02 on 192.168.1.21:31010]
>>>>>   at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
>>>>>   at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
>>>>>   at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
>>>>>   at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
>>>>>   at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
>>>>>   at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>>>>>   at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>>>>>   at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>>>>>   at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
>>>>>   at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>>>>>   at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>>>>>   at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>>>>>   at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
>>>>>   at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>>>>>   at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>>>>>   at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>>>>>   at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>>>>>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>>>>>   at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
>>>>>   at java.lang.Thread.run(Thread.java:748)
>>>>> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Field ordinal 1 is invalid for  type '(DrillRecordRow[**])'
>>>>>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:303)
>>>>>   at .......(:0)
>>>>> Caused by: java.lang.AssertionError: Field ordinal 1 is invalid for  type '(DrillRecordRow[**])'
>>>>>   at org.apache.calcite.rex.RexBuilder.makeFieldAccess(RexBuilder.java:197)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3694)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter.access$2200(SqlToRelConverter.java:217)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4765)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4061)
>>>>>   at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:317)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4625)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3908)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:670)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:627)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3150)
>>>>>   at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:563)
>>>>>   at org.apache.drill.exec.planner.sql.SqlConverter.toRel(SqlConverter.java:414)
>>>>>   at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:685)
>>>>>   at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:202)
>>>>>   at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:172)
>>>>>   at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:226)
>>>>>   at org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:124)
>>>>>   at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:90)
>>>>>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:591)
>>>>>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:276)
>>>>>   ... 1 more
>>>>> apache drill>
