Hi Paul, 
Thanks for the response.  Unfortunately, I tried setting just a fieldName in the 
schema and still got an error. 

    "ssdlog": {
      "type": "logRegex",
      "regex": "(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)",
      "extension": "ssdlog",
      "maxErrors": 10,
      "schema": [{"fieldName": "test"}]
    },
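
For reference, here is a minimal standalone sketch that checks how many capture groups the regex above defines and what each one picks up. It uses plain java.util.regex and is not Drill code. The class name, the sample log line, and the choice of matches() (rather than find()) are assumptions made only for illustration. The pattern has five groups, the same number as the five fields in the original schema quoted below.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SsdLogRegexCheck {
  // Same pattern as the "regex" value in the config above.
  // JSON string escapes and Java string escapes coincide for this pattern.
  static final String REGEX =
      "(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s"
      + "(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)";

  // Hypothetical log line, invented for this sketch.
  static final String SAMPLE =
      "Jul 16 2019 19:08:00 sshd[1234]: Accepted connection from 10.0.0.5 port 22";

  // Number of capturing groups the pattern defines.
  static int groupCount() {
    return Pattern.compile(REGEX).matcher("").groupCount();
  }

  // Captured groups for a line that matches in full, or an empty array otherwise.
  static String[] groups(String line) {
    Matcher m = Pattern.compile(REGEX).matcher(line);
    if (!m.matches()) {
      return new String[0];
    }
    String[] out = new String[m.groupCount()];
    for (int i = 1; i <= m.groupCount(); i++) {
      out[i - 1] = m.group(i);
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println("capture groups: " + groupCount());
    for (String g : groups(SAMPLE)) {
      System.out.println(g);
    }
  }
}
```

Note that the fifth group (the IP address) is nested inside the fourth (the message), so the message field contains the address as well.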
--C


> On Jul 16, 2019, at 7:08 PM, Paul Rogers <par0...@yahoo.com.INVALID> wrote:
> 
> Hi Charles,
> 
> Thanks much for the feedback. I'll take a look.
> 
> A quick look at your config suggests that the timestamp might be the issue. 
> As I recall, there were no such tests in the unit test class. So, perhaps 
> something slipped through. (We should add a test for this case.)
> 
> 
> In EVF, we use the Joda (not Java 8) date/time classes. [1] (We do this for 
> obscure reasons related to how Drill handles intervals, and the fact that the 
> Java 8 date/time classes are not a full replacement for Joda.)
> 
> With Joda, your format should be: "MMM dd yyyy HH:mm:ss" (Note the upper case 
> "H"). Try this to see if it gets you unstuck.
> 
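
A small sketch of the "hh" versus "HH" distinction described above. One loud assumption: it uses the JDK's java.time formatter instead of Joda, purely so that it runs without extra dependencies. Both libraries define "hh" as the 1-12 clock hour and "HH" as the 0-23 hour of day, but this is not the code path Drill uses. The class name and sample timestamp are made up.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

public class HourPatternDemo {
  // Returns true if the text parses as a date/time under the given pattern.
  static boolean parses(String pattern, String text) {
    try {
      // Locale.ENGLISH so that "MMM" expects English month abbreviations.
      LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH));
      return true;
    } catch (DateTimeParseException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Hypothetical timestamp with an hour above 12.
    String ts = "Jul 16 2019 19:08:00";
    // "hh" only accepts 1-12 (and needs an AM/PM marker), so this one is rejected.
    System.out.println("hh: " + parses("MMM dd yyyy hh:mm:ss", ts));
    // "HH" accepts 0-23, so this one is accepted.
    System.out.println("HH: " + parses("MMM dd yyyy HH:mm:ss", ts));
  }
}
```
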
> What we should really do is support SQL format strings. These are not 
> standard, but the Postgres format seems common [2]. Someone added this feature 
> to Drill a while back, so we must have a Postgres-to-Joda format converter in 
> the code somewhere we could use.
> 
> Thanks,
> - Paul
> 
> 
> [1] 
> https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html
> 
> [2] https://www.postgresql.org/docs/9.1/functions-formatting.html
> 
> 
> 
> 
>    On Tuesday, July 16, 2019, 02:23:50 PM PDT, Charles Givre 
> <cgi...@gmail.com> wrote:  
> 
> 
> Hello All, 
> First, a big thank you, Paul, for updating the log regex reader to the new EVF 
> framework.  I am having a little trouble getting it to work, however.
> Here is my config:
> 
> ,
>     "ssdlog": {
>       "type": "logRegex",
>       "regex": "(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)",
>       "extension": "ssdlog",
>       "maxErrors": 10,
>       "schema": [
>           {"fieldName":"eventDate"}
>           ]
>     },
> 
> This works if I leave the schema null; however, if I attempt to populate it, I 
> get JSON errors.  This is what I originally had:
> 
> "schema" : [ {
>         "fieldName" : "eventDate",
>         "fieldType" : "TIMESTAMP",
>         "format" : "MMM dd yyyy hh:mm:ss"
>       }, {
>         "fieldName" : "process_name"
>       }, {
>         "fieldName" : "pid",
>         "fieldType" : "INT"
>       }, {
>         "fieldName" : "message"
>       }, {
>         "fieldName" : "src_ip"
>       } ]
> 
> which worked.  
> 
> 
> Also, I am working on updating a few format plugins and keep getting the 
> following error when I try to run unit tests:
> 
> at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:152)
>     at org.apache.drill.test.ClusterFixtureBuilder.build(ClusterFixtureBuilder.java:283)
>     at org.apache.drill.test.ClusterTest.startCluster(ClusterTest.java:83)
>     at org.apache.drill.exec.store.excel.TestExcelFormat.setup(TestExcelFormat.java:49)
> Caused by: com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'drill.exec.grace_period_ms'
>     at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
>     at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
>     at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
>     at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
>     at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
>     at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
>     at com.typesafe.config.impl.SimpleConfig.getConfigNumber(SimpleConfig.java:170)
>     at com.typesafe.config.impl.SimpleConfig.getInt(SimpleConfig.java:181)
>     at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
>     at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
>     at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
>     at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
>     at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:160)
>     at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:138)
>     at org.apache.drill.test.ClusterFixture.startDrillbits(ClusterFixture.java:228)
>     at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:146)
>     ... 3 more
> 
> 
> Process finished with exit code 255
> 
> I understand that I have to set the variable drill.exec.grace_period_ms, but 
> I'm not sure how/where to do this.  Here is the beginning of my unit test 
> code:
> 
> @ClassRule
> public static final BaseDirTestWatcher dirTestWatcher = new BaseDirTestWatcher();
> 
> @BeforeClass
> public static void setup() throws Exception {
>   ClusterTest.startCluster(ClusterFixture.builder(dirTestWatcher).maxParallelization(1));
>   definePlugin();
> }
> 
> private static void definePlugin() throws ExecutionSetupException {
>   ExcelFormatConfig sampleConfig = new ExcelFormatConfig();
> 
>   // Define a temporary plugin for the "cp" storage plugin.
>   Drillbit drillbit = cluster.drillbit();
>   final StoragePluginRegistry pluginRegistry = drillbit.getContext().getStorage();
>   final FileSystemPlugin plugin = (FileSystemPlugin) pluginRegistry.getPlugin("cp");
>   final FileSystemConfig pluginConfig = (FileSystemConfig) plugin.getConfig();
>   pluginConfig.getFormats().put("sample", sampleConfig);
>   pluginRegistry.createOrUpdate("cp", pluginConfig, false);
> }
> 
> @Test
> public void testStarQuery() throws RpcException {
>   String sql = "SELECT * FROM cp.`excel/test_data.xlsx` LIMIT 5";
> 
>   RowSet results = client.queryBuilder().sql(sql).rowSet();
>   TupleMetadata expectedSchema = new SchemaBuilder()
>           .add("id", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
>           .add("first__name", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("last__name", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("email", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("gender", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("birthdate", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("balance", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
>           .add("order__count", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
>           .add("average__order", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
>           .buildSchema();
> 
>   RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
>           .addRow(1.0, "Cornelia", "Matej", "cmat...@mtv.com", "Female", "10/31/1974", 735.29, 22.0, 33.42227273)
>           .addRow(2.0, "Nydia", "Heintsch", "nheints...@godaddy.com", "Female", "12/10/1966", 784.14, 22.0, 35.64272727)
>           .addRow(3.0, "Waiter", "Sherel", "wsher...@utexas.edu", "Male", "3/12/1961", 172.36, 17.0, 10.13882353)
>           .addRow(4.0, "Cicely", "Lyver", "clyv...@mysql.com", "Female", "5/4/2000", 987.39, 6.0, 164.565)
>           .addRow(5.0, "Dorie", "Doe", "dd...@spotify.com", "Female", "12/28/1955", 852.48, 17.0, 50.14588235)
>           .build();
> 
>   new RowSetComparison(expected).verifyAndClearAll(results);
> }
> 
> Thanks!
> -C
> 
> 
> 
> 
