Two query questions on time bin function

Yiran Wang Fri, 19 Feb 2016 15:40:36 -0800

Hi Asterix team,

I have two queries I'm struggling with. I'm hoping you could provide a
direction for me. Thanks in advance!


Here is what the data structure looks like:

create type HRMType as closed {

  row_id: int32,

  sid: int32,

  date: date,

  day: int32,

  time: time,

  bpm: int32,

  RR: float

};

create dataset HRM (HRMType)

primary key row_id;


Previously I have used the time bin function to calculate the standard
deviation of bpm for each time bin:

for $i in dataset HRM

group by $sid := $i.sid, $gdate := $i.date, $gday := $i.day, $timebin :=
interval-bin($i.time, time("00:00:00"), day-time-duration("PT1M")) with $i

return {

"sid": $sid,

"gdate": $gdate,

"gday": $gday,

"timebin": $timebin,

"stdv": (avg(for $ii in $i return $ii.RR * $ii.RR) - avg(for $ii in $i
return $ii.RR) * avg(for $ii in $i return $ii.RR))^(0.5)};

Now I have two things I am hoping to do but need help with:

1. For each 1-min time bin, remove the bpm values that are above the top 5%
or below the bottom 5%. I thought about using the min/max function for a
few times to achieve this, but realized that it was not a good idea because
in each time bin, the number of instances are not always the same. So for
each 1-min time bin, we do need to calculate the 5% and 95% threshold, and
remove instances accordingly, which I don't know how to do.

2. After removing the outliers of bpm for each 1-min time bin, calculate a
median absolute deviation (MAD) for each 1-min time bin (as another measure
of variation besides the standard deviation). MAD =
median(abs(x-median(x)). I'm not sure how to write a query to do the median
function in Asterix.

Thank you so much in advance. Let me know if my questions are clear.

Yiran

-- 
Best,
Yiran

Two query questions on time bin function

Reply via email to