[ 
https://issues.apache.org/jira/browse/KUDU-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated KUDU-1882:
-----------------------------
    Description: 
The RegexpKuduOperationsProducer currently has the following configuration 
options that could be improved:


||Property Name || Default || Required? || Description ||
| {{producer.skipMissingColumn}} | {{false}} | No | What to do if a column in 
the Kudu table has no corresponding capture group. If set to true, a warning 
message is logged and the operation is still attempted. If set to false, an 
exception is thrown and the sink will not process the Event, causing a Flume 
Channel rollback.|
| {{producer.skipBadColumnValue}} | {{false}} | No | What to do if a value in 
the pattern match cannot be coerced to the required type. If set to true, a 
warning message is logged and the operation is still attempted. If set to 
false, an exception is thrown and the sink will not process the Event, causing 
a Flume Channel rollback. |
| {{producer.warnUnmatchedRows}} | {{true}} | No | Whether to log a warning 
about payloads that do not match the pattern. If set to false, event bodies 
with no matches will be silently dropped. |

It would be an improvement if each of these concepts had the following 
options: {{warn}}, {{ignore}}, {{reject}}

Where {{warn}} would log a warning to the Flume log and continue processing, 
{{ignore}} would attempt to continue processing without issuing a warning, and 
{{reject}} would throw an exception.
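
As a rough illustration of the three proposed behaviors, here is a minimal Java sketch. Everything in it is hypothetical: the class name, the {{Policy}} enum, the {{handle}} helper and the in-memory {{warnings}} list (a stand-in for the Flume log) do not exist in the Kudu Flume sink, and {{IllegalStateException}} only stands in for whatever exception the producer would actually throw.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch; none of these names exist in the Kudu Flume sink. */
final class ProducerPolicySketch {

  /** The three behaviors proposed in this issue. */
  enum Policy {
    WARN, IGNORE, REJECT;

    /** Parses a configured value such as "warn" (case-insensitive). */
    static Policy parse(String value) {
      return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
  }

  /** Stand-in for the Flume log so the sketch has no dependencies. */
  static final List<String> warnings = new ArrayList<>();

  /**
   * Applies a policy to a problem found while building an operation.
   * Returns normally for WARN and IGNORE so processing continues;
   * throws for REJECT.
   */
  static void handle(Policy policy, String problem) {
    switch (policy) {
      case WARN:
        warnings.add(problem); // log a warning, then continue
        break;
      case IGNORE:
        break;                 // continue without logging
      case REJECT:
        // Assumption: the real producer would throw its own exception type.
        throw new IllegalStateException(problem);
    }
  }

  public static void main(String[] args) {
    handle(Policy.parse("warn"), "no capture group for column 'ts'");
    handle(Policy.parse("ignore"), "no capture group for column 'note'");
    System.out.println("warnings logged: " + warnings.size());
    try {
      handle(Policy.parse("reject"), "payload did not match the pattern");
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With one policy type shared by all three settings, each setting could select any of the three behaviors independently.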

It may be that some fields are nullable or have defaults, potentially due to an 
ALTER TABLE, and we don't want to fill the Flume logs with useless warnings. 
Users may also want to reject any Events that don't match their regex so they 
can correct the configuration and restart Flume without losing those Events.
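
Read against the table above, each current boolean selects between only two of the three behaviors, which is why neither case in the previous paragraph can be configured today. The sketch below spells out that correspondence. The class, enum and method names are hypothetical and only illustrate the mapping; they are not part of the sink.

```java
/** Hypothetical sketch; the names below are illustrative, not part of Kudu. */
final class LegacyOptionMappingSketch {

  enum Policy { WARN, IGNORE, REJECT }

  /** producer.skipMissingColumn: true logs and continues, false throws. */
  static Policy fromSkipMissingColumn(boolean skip) {
    return skip ? Policy.WARN : Policy.REJECT;
  }

  /** producer.skipBadColumnValue: true logs and continues, false throws. */
  static Policy fromSkipBadColumnValue(boolean skip) {
    return skip ? Policy.WARN : Policy.REJECT;
  }

  /** producer.warnUnmatchedRows: true logs, false drops silently. */
  static Policy fromWarnUnmatchedRows(boolean warn) {
    return warn ? Policy.WARN : Policy.IGNORE;
  }

  public static void main(String[] args) {
    // Defaults from the table: false, false, true.
    System.out.println("skipMissingColumn=false  -> " + fromSkipMissingColumn(false));
    System.out.println("skipBadColumnValue=false -> " + fromSkipBadColumnValue(false));
    System.out.println("warnUnmatchedRows=true   -> " + fromWarnUnmatchedRows(true));
  }
}
```

Under this reading, no boolean value yields {{ignore}} for a missing column or a bad value, and none yields {{reject}} for an unmatched payload.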

  was:
The RegexpKuduOperationsProducer currently has the following configuration 
options that could be improved:


||Property Name || Default || Required? || Description ||
| producer.skipMissingColumn     | false | No | What to do if a column in the 
Kudu table has no corresponding capture group. If set to true, a warning 
message is logged and the operation is still attempted. If set to false, an 
exception is thrown and the sink will not process the Event, causing a Flume 
Channel rollback.|
| producer.skipBadColumnValue | false | No | What to do if a value in the 
pattern match cannot be coerced to the required type. If set to true, a warning 
message is logged and the operation is still attempted. If set to false, an 
exception is thrown and the sink will not process the Event, causing a Flume 
Channel rollback. |
| producer.warnUnmatchedRows | true | No | Whether to log a warning about 
payloads that do not match the pattern. If set to false, event bodies with no 
matches will be silently dropped. |

It would be an improvement if each of these concepts had the following 
options: {{warn}}, {{ignore}}, {{reject}}

Where {{warn}} would log a warning to the Flume log and continue processing, 
{{ignore}} would attempt to continue processing without issuing a warning, and 
{{reject}} would throw an exception.

It may be that some fields are nullable or have defaults, potentially due to an 
ALTER TABLE, and we don't want to fill the Flume logs with useless warnings. 
Users may also want to reject any Events that don't match their regex so they 
can correct the configuration and restart Flume without losing those Events.


> Configuration improvements for Flume Kudu Sink regexp operations producer
> -------------------------------------------------------------------------
>
>                 Key: KUDU-1882
>                 URL: https://issues.apache.org/jira/browse/KUDU-1882
>             Project: Kudu
>          Issue Type: Bug
>          Components: flume-sink, integration
>    Affects Versions: 1.2.0
>            Reporter: Mike Percy
>
> The RegexpKuduOperationsProducer currently has the following configuration 
> options that could be improved:
>
> ||Property Name || Default || Required? || Description ||
> | {{producer.skipMissingColumn}} | {{false}} | No | What to do if a column in 
> the Kudu table has no corresponding capture group. If set to true, a warning 
> message is logged and the operation is still attempted. If set to false, an 
> exception is thrown and the sink will not process the Event, causing a Flume 
> Channel rollback.|
> | {{producer.skipBadColumnValue}} | {{false}} | No | What to do if a value in 
> the pattern match cannot be coerced to the required type. If set to true, a 
> warning message is logged and the operation is still attempted. If set to 
> false, an exception is thrown and the sink will not process the Event, 
> causing a Flume Channel rollback. |
> | {{producer.warnUnmatchedRows}} | {{true}} | No | Whether to log a warning 
> about payloads that do not match the pattern. If set to false, event bodies 
> with no matches will be silently dropped. |
>
> It would be an improvement if each of these concepts had the following 
> options: {{warn}}, {{ignore}}, {{reject}}
>
> Where {{warn}} would log a warning to the Flume log and continue processing, 
> {{ignore}} would attempt to continue processing without issuing a warning, 
> and {{reject}} would throw an exception.
>
> It may be that some fields are nullable or have defaults, potentially due to 
> an ALTER TABLE, and we don't want to fill the Flume logs with useless 
> warnings. Users may also want to reject any Events that don't match their 
> regex so they can correct the configuration and restart Flume without losing 
> those Events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
