Re: 'drill.exec.grace_period_ms' Errors

2019-07-20 Thread Paul Rogers
Hi Charles,

I just ran some unit tests, using master, and did not see the 
drill.exec.grace_period_ms error that you saw.

drill.exec.grace_period_ms is defined in ExecConstants.java, is used in 
Drillbit startup in Drillbit.java, and has a value defined in 
src/main/resources/drill-module.conf.

In other words, it seems everything is set up the way it should be. I wonder, 
do you have an old version of drill-module.conf? If you check your working 
branch, do you have any unexpected changes? ("git status"). Also, have you 
grabbed the latest master branch recently? ("git checkout master; git pull 
apache master; git checkout ; git rebase master", where "apache" 
is whatever you named your Drill GitHub remote.)
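
One way to check for a stale copy is to list every drill-module.conf that is 
visible on the test classpath. The following is only a sketch using plain JDK 
calls; the class name FindModuleConf is hypothetical and is not part of Drill.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper (not a Drill class): lists every classpath resource
// with the given name so that stale or unexpected copies are easy to spot.
public class FindModuleConf {

    static List<URL> findResources(String name) throws IOException {
        List<URL> found = new ArrayList<>();
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        Enumeration<URL> urls = loader.getResources(name);
        while (urls.hasMoreElements()) {
            found.add(urls.nextElement());
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Print the location of each copy found on the current classpath.
        for (URL url : findResources("drill-module.conf")) {
            System.out.println(url);
        }
    }
}
```

Run it with the same classpath as the failing test. If the copy that should 
define drill.exec.grace_period_ms comes from an old jar or build directory, 
that would be consistent with the missing-key error.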

Thanks,
- Paul

 

On Tuesday, July 16, 2019, 6:45:01 PM PDT, Charles Givre  
wrote:  
 
> 
> Also, I am working on updating a few format plugins and kept getting the 
> following error when I try to run unit tests:
> 
> at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:152)
>    at org.apache.drill.test.ClusterFixtureBuilder.build(ClusterFixtureBuilder.java:283)
>    at org.apache.drill.test.ClusterTest.startCluster(ClusterTest.java:83)
>    at org.apache.drill.exec.store.excel.TestExcelFormat.setup(TestExcelFormat.java:49)
> Caused by: com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'drill.exec.grace_period_ms'
>    at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
>    at com.typesafe.config.impl.SimpleConfig.getConfigNumber(SimpleConfig.java:170)
>    at com.typesafe.config.impl.SimpleConfig.getInt(SimpleConfig.java:181)
>    at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
>    at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
>    at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
>    at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
>    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:160)
>    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:138)
>    at org.apache.drill.test.ClusterFixture.startDrillbits(ClusterFixture.java:228)
>    at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:146)
>    ... 3 more
> 
> 
> Process finished with exit code 255
> 
> I understand that I have to set the variable drill.exec.grace_period_ms, but 
> I'm not sure how/where to do this.  Here is the beginning of my unit test 
> code:
> 
> @ClassRule
> public static final BaseDirTestWatcher dirTestWatcher = new BaseDirTestWatcher();
> 
> @BeforeClass
> public static void setup() throws Exception {
>   ClusterTest.startCluster(ClusterFixture.builder(dirTestWatcher).maxParallelization(1));
>   definePlugin();
> }
> 
> private static void definePlugin() throws ExecutionSetupException {
>   ExcelFormatConfig sampleConfig = new ExcelFormatConfig();
> 
>   // Define a temporary plugin for the "cp" storage plugin.
>   Drillbit drillbit = cluster.drillbit();
>   final StoragePluginRegistry pluginRegistry = drillbit.getContext().getStorage();
>   final FileSystemPlugin plugin = (FileSystemPlugin) pluginRegistry.getPlugin("cp");
>   final FileSystemConfig pluginConfig = (FileSystemConfig) plugin.getConfig();
>   pluginConfig.getFormats().put("sample", sampleConfig);
>   pluginRegistry.createOrUpdate("cp", pluginConfig, false);
> }
> 
> @Test
> public void testStarQuery() throws RpcException {
>   String sql = "SELECT * FROM cp.`excel/test_data.xlsx` LIMIT 5";
> 
>   RowSet results = client.queryBuilder().sql(sql).rowSet();
>   TupleMetadata expectedSchema = new SchemaBuilder()
>           .add("id", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
>           .add("first__name", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("last__name", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("email", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("gender", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("birthdate", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
>           .add("balance", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
>           .add("order__count", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
>           .add("average__order", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
>           .buildSchema();
> 
>  RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
>          .addRow(1.0, "Cornelia", "Matej", "cmat...@mtv.com", "Female", 

Re: EVF Log Regex Errors

2019-07-20 Thread Paul Rogers
Hi Charles,

Turns out that there are two problems here. First, I mucked up the Jackson 
serialization of the schema objects. Second, you need to use the Joda format 
(with "HH") as we discussed. Once both those changes are made, things seem to 
work (at least in unit tests.)
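
To illustrate the "HH" versus "hh" point, here is a rough sketch. It uses 
java.time rather than Joda only so that it runs without extra dependencies; 
both libraries document "H" as hour of day (0-23) and "h" as clock hour of 
the half day (1-12), but the exact error behavior in Joda may differ. The 
sample timestamp, the "yyyy" year token, and the class name HourPatternDemo 
are assumptions made for this example.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical demo class: shows why a 24-hour timestamp needs "HH".
public class HourPatternDemo {

    // Returns true if the text parses as a LocalDateTime with the pattern.
    static boolean parses(String pattern, String text) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH);
        try {
            LocalDateTime.parse(text, fmt);
            return true;
        } catch (DateTimeException e) {
            // Raised when a field is out of range or cannot be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        String sample = "Jul 16 2019 18:45:01"; // assumed sample log timestamp

        // "HH" accepts hour 18 because it means hour of day (0-23).
        System.out.println("HH parses: " + parses("MMM dd yyyy HH:mm:ss", sample));

        // "hh" rejects hour 18 because it means clock hour of half day (1-12).
        System.out.println("hh parses: " + parses("MMM dd yyyy hh:mm:ss", sample));
    }
}
```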

There is a PR for the fix. Please review.

Thanks,
- Paul

 

On Tuesday, July 16, 2019, 6:45:01 PM PDT, Charles Givre  
wrote:  
 
 Hi Paul, 
Thanks for the response.  Unfortunately, I tried simply setting a fieldName and 
got an error. 

 "ssdlog": {
      "type": "logRegex",
      "regex": 
"(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)",
      "extension": "ssdlog",
      "maxErrors": 10,
      "schema": [{"fieldName": "test"}]
    },
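
For reference, the regex in this config has five capture groups, which line 
up with the five schema fields quoted later in this thread (eventDate, 
process_name, pid, message, src_ip). The sketch below checks that with plain 
java.util.regex. The sample log line and the class name SsdlogRegexDemo are 
made up for illustration, and the whole-line match is an assumption; this is 
not taken from the plugin code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: applies the config's regex to one sample line.
public class SsdlogRegexDemo {

    // Same regex as in the config, written as a Java string literal.
    static final Pattern LINE = Pattern.compile(
        "(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s"
        + "(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)");

    // Returns the captured groups in order, or an empty list if the
    // whole line does not match (assumption: full-line matching).
    static List<String> groups(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = LINE.matcher(line);
        if (m.matches()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up line in the shape the regex expects.
        String sample = "Jul 16 2019 18:45:01 sshd[1234]: Failed login from 10.0.0.5 port 22";
        String[] fields = {"eventDate", "process_name", "pid", "message", "src_ip"};
        List<String> values = groups(sample);
        for (int i = 0; i < values.size(); i++) {
            System.out.println(fields[i] + " = " + values.get(i));
        }
    }
}
```

Note that the fifth group is nested inside the fourth, so the IP address 
appears both on its own and as part of the message text.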
--C


> On Jul 16, 2019, at 7:08 PM, Paul Rogers  wrote:
> 
> Hi Charles,
> 
> Thanks much for the feedback. I'll take a look.
> 
> A quick look at your config suggests that the timestamp might be the issue. 
> As I recall, there were no such tests in the unit test class. So, perhaps 
> something slipped through. (We should add a test for this case.)
> 
> 
> In EVF, we use the Joda (not Java 8) date/time classes. [1] (We do this for 
> obscure reasons related to how Drill handles intervals, and the fact that the 
> Java 8 date/time classes are not a full replacement for Joda.)
> 
> With Joda, your format should be: "MMM dd  HH:mm:ss" (Note the upper case 
> "H"). Try this to see if it gets you unstuck.
> 
> What we should really do is support SQL format strings. These are not 
> standard, but the Postgres format seems common [2]. Someone added this feature 
> to Drill a while back, so we must have a Postgres-to-Joda format converter in 
> the code somewhere we could use.
> 
> Thanks,
> - Paul
> 
> 
> [1] 
> https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html
> 
> [2] https://www.postgresql.org/docs/9.1/functions-formatting.html
> 
> 
> 
> 
>    On Tuesday, July 16, 2019, 02:23:50 PM PDT, Charles Givre 
> wrote:  
> 
> 
> Hello All, 
> First, a big thank you Paul for updating the log regex reader to the new EVF 
> framework.  I am having a little trouble getting it to work however...
> Here is my config:
> 
> ,
>    "ssdlog": {
>      "type": "logRegex",
>      "regex": 
>"(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)",
>      "extension": "ssdlog",
>      "maxErrors": 10,
>      "schema": [
>          {"fieldName":"eventDate"}
>          ]
>    },
> 
> This works if I leave the schema null, however if I attempt to populate it, I 
> get JSON errors.  This was what I originally had:
> 
> "schema" : [ {
>        "fieldName" : "eventDate",
>        "fieldType" : "TIMESTAMP",
>        "format" : "MMM dd  hh:mm:ss"
>      }, {
>        "fieldName" : "process_name"
>      }, {
>        "fieldName" : "pid",
>        "fieldType" : "INT"
>      }, {
>        "fieldName" : "message"
>      }, {
>        "fieldName" : "src_ip"
>      } ]
> 
> which worked.  
> 
> 
> Also, I am working on updating a few format plugins and kept getting the 
> following error when I try to run unit tests:
> 
> at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:152)
>    at org.apache.drill.test.ClusterFixtureBuilder.build(ClusterFixtureBuilder.java:283)
>    at org.apache.drill.test.ClusterTest.startCluster(ClusterTest.java:83)
>    at org.apache.drill.exec.store.excel.TestExcelFormat.setup(TestExcelFormat.java:49)
> Caused by: com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'drill.exec.grace_period_ms'
>    at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
>    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
>    at com.typesafe.config.impl.SimpleConfig.getConfigNumber(SimpleConfig.java:170)
>    at com.typesafe.config.impl.SimpleConfig.getInt(SimpleConfig.java:181)
>    at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
>    at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
>    at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
>    at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
>    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:160)
>    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:138)
>    at org.apache.drill.test.ClusterFixture.startDrillbits(ClusterFixture.java:228)
>    at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:146)
>    ... 3 more
> 
> 
> Process finished with exit code 255
> 
> I understand 

[GitHub] [drill] paul-rogers opened a new pull request #1827: DRILL-7327: Log Regex Plugin Won't Recognize Schema

2019-07-20 Thread GitBox
paul-rogers opened a new pull request #1827: DRILL-7327: Log Regex Plugin Won't 
Recognize Schema
URL: https://github.com/apache/drill/pull/1827
 
 
   The previous commit revised the plugin config classes to work
   with table functions. That caused Jackson to stop working for
   the classes. Fixed those issues and added unit tests.




[jira] [Resolved] (DRILL-7327) Log Regex Plugin Won't Recognize Schema

2019-07-20 Thread Paul Rogers (JIRA)


 [ https://issues.apache.org/jira/browse/DRILL-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers resolved DRILL-7327.

Resolution: Not A Bug

> Log Regex Plugin Won't Recognize Schema
> ---
>
> Key: DRILL-7327
> URL: https://issues.apache.org/jira/browse/DRILL-7327
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Assignee: Paul Rogers
>Priority: Major
> Attachments: firewall.ssdlog
>
>
> When I attempt to define a schema for the new `logRegex` plugin, 
> Drill does not recognize the plugin if the configuration includes a schema.
> {code:json}
> {,
> "ssdlog": {
>   "type": "logRegex",
>   "regex": 
> "(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)",
>   "extension": "ssdlog",
>   "maxErrors": 10,
>   "schema": []
> }
> {code}
> This configuration works, however, this does not:
> {code:json}
> {,
> "ssdlog": {
>   "type": "logRegex",
>   "regex": 
> "(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)",
>   "extension": "ssdlog",
>   "maxErrors": 10,
>   "schema": [
> {"fieldName":"eventDate"}
> ]
> }
> {code}
> [~paul-rogers]


