[jira] [Updated] (OOZIE-3641) preserve schema order in during SchemaService::loadSchema to allow dependencies across schemas

2021-10-28 Thread Jan Filipiak (Jira)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Filipiak updated OOZIE-3641:

Affects Version/s: 5.2.1

> preserve schema order in during SchemaService::loadSchema to allow 
> dependencies across schemas
> --
>
> Key: OOZIE-3641
> URL: https://issues.apache.org/jira/browse/OOZIE-3641
> Project: Oozie
> Issue Type: Improvement
> Affects Versions: 5.2.1
> Reporter: Jan Filipiak
> Priority: Trivial
>
> Schemas loaded by the schema service are taken from the configuration and 
> then placed into a HashSet before being passed to the schemaFactory. Because 
> a HashSet does not preserve insertion order, schemas that depend on each 
> other may not be loaded in the correct order, which can prevent the Oozie 
> server from starting.
>  
> https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/service/SchemaService.java



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (OOZIE-3641) preserve schema order in during SchemaService::loadSchema to allow dependencies across schemas

2021-10-28 Thread Jan Filipiak (Jira)
Jan Filipiak created OOZIE-3641:
---

 Summary: preserve schema order in during SchemaService::loadSchema 
to allow dependencies across schemas
 Key: OOZIE-3641
 URL: https://issues.apache.org/jira/browse/OOZIE-3641
 Project: Oozie
  Issue Type: Improvement
Reporter: Jan Filipiak


Schemas loaded by the schema service are taken from the configuration and then 
placed into a HashSet before being passed to the schemaFactory. Because a 
HashSet does not preserve insertion order, schemas that depend on each other 
may not be loaded in the correct order, which can prevent the Oozie server 
from starting.

 

https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/service/SchemaService.java
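
The ordering problem described above can be illustrated with a minimal sketch. The class name and the schema file names here are hypothetical and are not taken from SchemaService; the sketch only shows that a HashSet does not retain configuration order, while a LinkedHashSet de-duplicates and keeps first-seen order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SchemaOrderSketch {

    // De-duplicates schema names while keeping first-seen (configuration) order.
    static List<String> dedupePreservingOrder(List<String> configured) {
        return new ArrayList<>(new LinkedHashSet<>(configured));
    }

    public static void main(String[] args) {
        // Hypothetical schema names; assume "base-types.xsd" must be loaded
        // before the schemas that depend on it.
        List<String> configured = Arrays.asList(
                "base-types.xsd", "workflow.xsd", "custom-action.xsd", "workflow.xsd");

        // Iteration order of a HashSet follows hash codes, not insertion order.
        Set<String> unordered = new HashSet<>(configured);
        System.out.println("HashSet order:       " + unordered);

        // LinkedHashSet iterates in insertion order, so dependencies stay first.
        List<String> ordered = dedupePreservingOrder(configured);
        System.out.println("LinkedHashSet order: " + ordered);
    }
}
```

Whether the actual fix should use a LinkedHashSet or simply keep the configured list is a design choice for the patch; either would keep the order given in the configuration.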





[jira] [Commented] (OOZIE-2457) Oozie log parsing regex consume more than 90% cpu

2017-03-29 Thread Jan Filipiak (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947614#comment-15947614
 ] 

Jan Filipiak commented on OOZIE-2457:
-

Hi, are there any plans to change the logging implementation entirely? Maybe 
keep a log per Bundle/Coord/Job? Would a patch be welcome that stores those 
logs in HDFS with configurable retention times?

Don't get me wrong, but I just ran into this issue today and couldn't really 
believe it.

> Oozie log parsing regex consume more than 90% cpu
> -
>
> Key: OOZIE-2457
> URL: https://issues.apache.org/jira/browse/OOZIE-2457
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Priority: Blocker
> Fix For: 5.0.0
>
> Attachments: OOZIE-2457-1.patch, OOZIE-2457-2.patch, 
> OOZIE-2457-3.patch, OOZIE-2457-4.patch, OOZIE-2457-5.patch, OOZIE-2457-6.patch
>
>
> http-0.0.0.0-4080-26  TID=62215  STATE=RUNNABLE  CPU_TIME=1992 (92.59%)  
> USER_TIME=1990 (92.46%) Allocted: 269156584
> java.util.regex.Pattern$Curly.match0(Pattern.java:4170)
> java.util.regex.Pattern$Curly.match(Pattern.java:4132)
> java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
> java.util.regex.Matcher.match(Matcher.java:1221)
> java.util.regex.Matcher.matches(Matcher.java:559)
> org.apache.oozie.util.XLogFilter.matches(XLogFilter.java:136)
> org.apache.oozie.util.TimestampedMessageParser.parseNextLine(TimestampedMessageParser.java:145)
> org.apache.oozie.util.TimestampedMessageParser.increment(TimestampedMessageParser.java:92)
> Regex 
> {code}
> (.* USER\[[^\]]*\] GROUP\[[^\]]*\] TOKEN\[[^\]]*\] APP\[[^\]]*\] JOB\[000-150625114739728-oozie-puru-W\] ACTION\[[^\]]*\] .*)
> {code}
> To parse a single line, we use two regexes.
> 1. 
> {code}
> public ArrayList<String> splitLogMessage(String logLine) {
>     Matcher splitter = SPLITTER_PATTERN.matcher(logLine);
>     if (splitter.matches()) {
>         ArrayList<String> logParts = new ArrayList<String>();
>         logParts.add(splitter.group(1)); // timestamp
>         logParts.add(splitter.group(2)); // log level
>         logParts.add(splitter.group(3)); // log message
>         return logParts;
>     }
>     else {
>         return null;
>     }
> }
> {code}
> 2.
> {code}
> public boolean matches(ArrayList<String> logParts) {
>     if (getStartDate() != null) {
>         // Compare the first 19 characters of the timestamp with the start date
>         if (logParts.get(0).substring(0, 19).compareTo(getFormattedStartDate()) < 0) {
>             return false;
>         }
>     }
>     String logLevel = logParts.get(1);
>     String logMessage = logParts.get(2);
>     if (this.logLevels == null || this.logLevels.containsKey(logLevel.toUpperCase())) {
>         Matcher logMatcher = filterPattern.matcher(logMessage);
>         return logMatcher.matches();
>     }
>     else {
>         return false;
>     }
> }
> {code}
> Also, the same log message is parsed repeatedly in
> {code}
> private String parseTimestamp(String line) {
>     String timestamp = null;
>     ArrayList<String> logParts = filter.splitLogMessage(line);
>     if (logParts != null) {
>         timestamp = logParts.get(0);
>     }
>     return timestamp;
> }
> {code}
> where the {{line}} has already been parsed using the regex and we already 
> know the {{logParts}}, if any.
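
The duplicated parsing noted above could be avoided by splitting each line once and handing the resulting parts to both the timestamp lookup and the filter. The following is a minimal sketch under stated assumptions: the class name, the simplified SPLITTER_PATTERN, and the methods `split` and `timestampOf` are hypothetical and are not the actual XLogFilter or TimestampedMessageParser code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParseOnceSketch {

    // Simplified, assumed line layout: "<date> <time> <LEVEL> <message>".
    private static final Pattern SPLITTER_PATTERN =
            Pattern.compile("(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s+(\\w+)\\s+(.*)");

    // Runs the splitter regex once per line; returns null if the line does not match.
    static List<String> split(String logLine) {
        Matcher splitter = SPLITTER_PATTERN.matcher(logLine);
        if (!splitter.matches()) {
            return null;
        }
        List<String> logParts = new ArrayList<>();
        logParts.add(splitter.group(1)); // timestamp
        logParts.add(splitter.group(2)); // log level
        logParts.add(splitter.group(3)); // log message
        return logParts;
    }

    // Reads the timestamp from already-split parts instead of re-running the regex.
    static String timestampOf(List<String> logParts) {
        return logParts == null ? null : logParts.get(0);
    }

    public static void main(String[] args) {
        // Made-up sample line in the assumed layout.
        String line = "2016-01-20 10:15:30,123 INFO ActionStartXCommand:520 - USER[puru] starting";
        List<String> logParts = split(line);        // single regex pass
        System.out.println(timestampOf(logParts));  // reuses the parts, no second match
        System.out.println(logParts.get(1));
    }
}
```

This only removes the second splitter pass per line; it does not address the cost of the filter regex itself.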


