Hi Charles,

Thanks much for the feedback. I'll take a look.

A quick look at your config suggests that the timestamp might be the issue. As 
I recall, the unit test class has no tests for timestamp fields, so perhaps 
something slipped through. (We should add a test for this case.)


In EVF, we use the Joda (not Java 8) date/time classes. [1] (We do this for 
obscure reasons related to how Drill handles intervals, and the fact that the 
Java 8 date/time classes are not a full replacement for Joda.)

With Joda, your format should be: "MMM dd yyyy HH:mm:ss". (Note the upper-case 
"H": "HH" is the 24-hour hour of day, while "hh" is the 12-hour clock hour.) 
Try this to see if it gets you unstuck.
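
To see the difference between the two hour letters without a Drill build, here 
is a minimal sketch. It is an assumption-laden stand-in: it uses the JDK's 
java.time formatter rather than Joda (so it runs with no extra jars), the 
sample timestamp is made up, and the class and method names are hypothetical. 
The pattern letters "H" and "h" mean the same thing in both libraries, but 
Joda's exact error behavior may differ from what java.time does here.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical demo class; not part of Drill or EVF.
class HourPatternDemo {

  // Returns true if `text` parses as a local date-time under `pattern`.
  static boolean parses(String pattern, String text) {
    try {
      // Fix the locale so the "MMM" month abbreviation is read as English.
      DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH);
      LocalDateTime.parse(text, fmt);
      return true;
    } catch (DateTimeParseException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String sample = "Jul 16 2019 14:23:50"; // made-up log timestamp

    // "HH" is hour of day (0-23), so an afternoon hour such as 14 is accepted.
    System.out.println("HH: " + parses("MMM dd yyyy HH:mm:ss", sample));

    // "hh" is the 12-hour clock hour (1-12), so 14 is out of range.
    System.out.println("hh: " + parses("MMM dd yyyy hh:mm:ss", sample));
  }
}
```

With java.time, the "hh" pattern is rejected for this sample because 14 is 
outside the 1-12 range and no AM/PM marker is present.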

What we should really do is support SQL format strings. These are not 
standardized, but the Postgres format seems common [2]. Someone added this 
feature to Drill a while back, so we must have a Postgres-to-Joda format 
converter somewhere in the code that we could use.
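
To make the idea concrete, here is a rough sketch of what such a conversion 
could look like. This is not Drill's existing converter (I have not looked it 
up); the class name, method name, and token table are all hypothetical, and 
only a handful of Postgres template patterns are covered. It is 
case-sensitive and does not handle quoted literal text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; NOT the converter that ships with Drill.
class PgToJodaSketch {

  // Partial token table: Postgres template pattern -> Joda pattern letters.
  // Insertion order matters: longer tokens come first so "HH24" wins over "HH".
  private static final Map<String, String> TOKENS = new LinkedHashMap<>();
  static {
    TOKENS.put("YYYY", "yyyy"); // four-digit year
    TOKENS.put("HH24", "HH");   // hour of day, 00-23
    TOKENS.put("HH12", "hh");   // clock hour, 01-12
    TOKENS.put("Mon", "MMM");   // abbreviated month name
    TOKENS.put("HH", "hh");     // Postgres "HH" is also the 12-hour form
    TOKENS.put("MM", "MM");     // month number
    TOKENS.put("DD", "dd");     // day of month
    TOKENS.put("MI", "mm");     // minute
    TOKENS.put("SS", "ss");     // second
  }

  static String convert(String pg) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    outer:
    while (i < pg.length()) {
      // Try each known token at the current position.
      for (Map.Entry<String, String> e : TOKENS.entrySet()) {
        if (pg.startsWith(e.getKey(), i)) {
          out.append(e.getValue());
          i += e.getKey().length();
          continue outer;
        }
      }
      char c = pg.charAt(i);
      if (Character.isLetter(c)) {
        // Unknown letters would be misread as pattern letters, so fail loudly.
        throw new IllegalArgumentException(
            "Unsupported token at index " + i + " in: " + pg);
      }
      out.append(c); // spaces, colons, dashes, etc. pass through unchanged
      i++;
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(convert("Mon DD YYYY HH24:MI:SS"));
  }
}
```

Under this table, the Postgres template "Mon DD YYYY HH24:MI:SS" maps to the 
Joda pattern "MMM dd yyyy HH:mm:ss", the same format suggested above.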

Thanks,
- Paul


[1] 
https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html

[2] https://www.postgresql.org/docs/9.1/functions-formatting.html


 

    On Tuesday, July 16, 2019, 02:23:50 PM PDT, Charles Givre 
<cgi...@gmail.com> wrote:  
 
 
Hello All, 
First, a big thank you to Paul for updating the log regex reader to the new EVF 
framework.  I am having a little trouble getting it to work, however.
Here is my config:

,
    "ssdlog": {
      "type": "logRegex",
      "regex": 
"(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)",
      "extension": "ssdlog",
      "maxErrors": 10,
      "schema": [
          {"fieldName":"eventDate"}
          ]
    },

This works if I leave the schema null; however, if I attempt to populate it, I 
get JSON errors.  This is what I originally had:

"schema" : [ {
        "fieldName" : "eventDate",
        "fieldType" : "TIMESTAMP",
        "format" : "MMM dd yyyy hh:mm:ss"
      }, {
        "fieldName" : "process_name"
      }, {
        "fieldName" : "pid",
        "fieldType" : "INT"
      }, {
        "fieldName" : "message"
      }, {
        "fieldName" : "src_ip"
      } ]

which worked.  


Also, I am working on updating a few format plugins and kept getting the 
following error when I try to run unit tests:

    at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:152)
    at org.apache.drill.test.ClusterFixtureBuilder.build(ClusterFixtureBuilder.java:283)
    at org.apache.drill.test.ClusterTest.startCluster(ClusterTest.java:83)
    at org.apache.drill.exec.store.excel.TestExcelFormat.setup(TestExcelFormat.java:49)
Caused by: com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'drill.exec.grace_period_ms'
    at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
    at com.typesafe.config.impl.SimpleConfig.getConfigNumber(SimpleConfig.java:170)
    at com.typesafe.config.impl.SimpleConfig.getInt(SimpleConfig.java:181)
    at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
    at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
    at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
    at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:160)
    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:138)
    at org.apache.drill.test.ClusterFixture.startDrillbits(ClusterFixture.java:228)
    at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:146)
    ... 3 more


Process finished with exit code 255

I understand that I have to set the variable drill.exec.grace_period_ms, but 
I'm not sure how/where to do this.  Here is the beginning of my unit test code:

@ClassRule
public static final BaseDirTestWatcher dirTestWatcher = new BaseDirTestWatcher();

@BeforeClass
public static void setup() throws Exception {
  ClusterTest.startCluster(ClusterFixture.builder(dirTestWatcher).maxParallelization(1));
  definePlugin();
}

private static void definePlugin() throws ExecutionSetupException {
  ExcelFormatConfig sampleConfig = new ExcelFormatConfig();

  // Define a temporary plugin for the "cp" storage plugin.
  Drillbit drillbit = cluster.drillbit();
  final StoragePluginRegistry pluginRegistry = drillbit.getContext().getStorage();
  final FileSystemPlugin plugin = (FileSystemPlugin) pluginRegistry.getPlugin("cp");
  final FileSystemConfig pluginConfig = (FileSystemConfig) plugin.getConfig();
  pluginConfig.getFormats().put("sample", sampleConfig);
  pluginRegistry.createOrUpdate("cp", pluginConfig, false);
}

@Test
public void testStarQuery() throws RpcException {
  String sql = "SELECT * FROM cp.`excel/test_data.xlsx` LIMIT 5";

  RowSet results = client.queryBuilder().sql(sql).rowSet();
  TupleMetadata expectedSchema = new SchemaBuilder()
          .add("id", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
          .add("first__name", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
          .add("last__name", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
          .add("email", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
          .add("gender", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
          .add("birthdate", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
          .add("balance", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
          .add("order__count", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
          .add("average__order", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
          .buildSchema();

  RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
          .addRow(1.0, "Cornelia", "Matej", "cmat...@mtv.com", "Female",
              "10/31/1974", 735.29, 22.0, 33.42227273)
          .addRow(2.0, "Nydia", "Heintsch", "nheints...@godaddy.com", "Female",
              "12/10/1966", 784.14, 22.0, 35.64272727)
          .addRow(3.0, "Waiter", "Sherel", "wsher...@utexas.edu", "Male",
              "3/12/1961", 172.36, 17.0, 10.13882353)
          .addRow(4.0, "Cicely", "Lyver", "clyv...@mysql.com", "Female",
              "5/4/2000", 987.39, 6.0, 164.565)
          .addRow(5.0, "Dorie", "Doe", "dd...@spotify.com", "Female",
              "12/28/1955", 852.48, 17.0, 50.14588235)
          .build();

  new RowSetComparison(expected).verifyAndClearAll(results);
}

Thanks!
-C




  
