I'm a data junkie - interesting none the less
2009-10-20 3:00 PM |
2009-10-20 3:25 PM in reply to: #2469487 |
Expert 1087 Portland | Subject: RE: I'm a data junkie - interesting none the less As a fellow data junkie, this is very interesting! It would be very interesting to do an analysis of professional athletes only, to see how the swim compares to their overall finish place. |
2009-10-20 3:29 PM in reply to: #2469551 |
Cycling Guru 15134 Fulton, MD | Subject: RE: I'm a data junkie - interesting none the less I've faked my way through sprints, olympics and halfs with swimming. And while you cannot win the race on the swim, you can certainly lose it! I've missed out on overall and podium spots because I swim for sh-t. But I'm a strong cyclist and runner. Triathlon is one sport. Swimming is just a part of it that determines how well you do overall. |
2009-10-20 3:35 PM in reply to: #2469487 |
Master 1736 Midcoast Maine | Subject: RE: I'm a data junkie - interesting none the less As a mathematician - this is awesome! Now I can stop sweating the swim. ;-) Thanks for doing this. (Although, like the previous poster, I would be interested in how it stacks up for the elite/pros) |
2009-10-20 3:36 PM in reply to: #2469487 |
Elite 5316 Alturas, California | Subject: RE: I'm a data junkie - interesting none the less Yes, if there is a lot of junk in the data you can try subgroups of data. One might expect the bottom 50% to have a great deal of variability in which sport is their strongest, but as you move up from there to the 60th, 70th, 80th and 90th percentiles, and on to the top 5% and top 1%, you find that there is a nonlinear relationship across the overall finish progressions. |
2009-10-20 3:36 PM in reply to: #2469487 |
Elite 2645 Phoenix, AZ | Subject: RE: I'm a data junkie - interesting none the less As a data-driven UI engineer I'm impressed with your work. As a swimmer, I'm beyond pissed. Edit: Which xy plot is the run in your long course data? There are 2 labeled "bike." Edited by Slidell4life 2009-10-20 3:39 PM |
2009-10-20 4:14 PM in reply to: #2469487 |
Extreme Veteran 430 Madison, WI | Subject: RE: I'm a data junkie - interesting none the less Let's get some least squares regression analysis up in this biznatch! Well I suppose it would help if I actually looked at the data before I asked for something that was already there. Great work! Edited by tetchypoo 2009-10-20 4:15 PM |
2009-10-20 4:16 PM in reply to: #2469487 |
2009-10-20 4:18 PM in reply to: #2469487 |
Extreme Veteran 430 Madison, WI | Subject: RE: I'm a data junkie - interesting none the less What's really interesting about that is that the R² is higher for the HIM. What I'm guessing that means is that if you plotted that at an IM distance, you'd probably see better correlation. Still, bike and run are much better fits than swim. |
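For anyone who wants to reproduce the kind of fit being discussed, here is a minimal sketch of how an R² value for "discipline placing vs. overall placing" could be computed with a plain least-squares line. The function name and the six-athlete placings are made up purely for illustration; they are not the data from the original plots, and this is not necessarily how those plots were produced.

```python
def r_squared(x, y):
    """R^2 of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / syy


# Hypothetical placings for six athletes (illustration only, not real results)
overall_rank = [1, 2, 3, 4, 5, 6]
swim_rank = [4, 1, 6, 2, 5, 3]
bike_rank = [1, 3, 2, 4, 6, 5]

print("swim R^2:", round(r_squared(swim_rank, overall_rank), 3))
print("bike R^2:", round(r_squared(bike_rank, overall_rank), 3))
```

With placings like these, the bike line explains far more of the overall placing than the swim line does, which is the pattern the thread is describing. A higher R² only says the line fits better; it does not by itself say anything about statistical significance.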
2009-10-20 4:21 PM in reply to: #2469580 |
Elite 4048 Gilbert, Az. | Subject: RE: I'm a data junkie - interesting none the less Slidell4life - 2009-10-20 1:36 PM As a swimmer, I'm beyond pissed. x2 (Except I'm more exasperated/resigned). There are going to be a lot of strong bikers/runners that can make up some for a lousy swim, but I would almost guarantee that the top 5 in each age group are no worse than top 10 in the swim in their AG for sprint/oly. You have more chance of outliers the longer the distance. This looks like an attempt to create a personal excuse of "why I don't need to swim more to be competitive fool myself into thinking I'm competitive" Oh, and I don't find a Kirk in the triathlon results... John Edited by tkd.teacher 2009-10-20 4:25 PM |
2009-10-20 4:24 PM in reply to: #2469678 |
Extreme Veteran 430 Madison, WI | Subject: RE: I'm a data junkie - interesting none the less tkd.teacher - 2009-10-20 4:21 PM If I may...I think it speaks more to the disproportion of swimming as compared to bike and run, with respect to distance. This looks like an attempt to create a personal excuse of "why I don't need to swim more to be competitive fool myself into thinking I'm competitive" John Personally, it makes me feel much better about skipping my swim workout tonight, due to being too tired (read:lazy). Ugh, admitting that makes it much worse. Especially in front of you guys |
2009-10-20 4:25 PM in reply to: #2469487 |
Champion 5781 Northridge, California | Subject: RE: I'm a data junkie - interesting none the less Someone else here within the last few months posted that they had done a similar analysis and come up with what sounded like the same basic results: Low correlation between overall finish and swim finish, strong correlation with bike and run. Don't remember who it was that posted that, but he mentioned the run turning out to be the strongest predictor of the three. Not true with these data, but pretty darn close between bike and run. Interesting stuff. |
2009-10-20 4:25 PM in reply to: #2469487 |
Expert 1087 Portland | Subject: RE: I'm a data junkie - interesting none the less Geeky, yes, interesting, very! Do what you love because it makes you happy! Looking at those plots, it seems like this trend even holds true for the pros (for the most part). I think one big problem with looking at this data (after I posted I thought of this) is that they are pushing their bodies to the limit, and with such a "small" sample size it's really hard to find as much of a trend as we see in just the masses. With them pushing so hard, you have bodies shutting down and DNFing. This is very interesting and I think it would be an interesting thesis for someone to play with for a PhD. If I had to guess, this correlation comes from how difficult swimming is compared to biking and running. When you bike and run, if you just move your feet faster, 99% of the time you will move faster. With swimming that is not the case; it is SO technique based, and some people have a hard time trying to find that technique. I don't think that this means that you can skip swim workouts; in fact, the better shape you're in for the swim, the better off you'll be for the bike and consequently, the run. Edited by menglo 2009-10-20 4:29 PM |
2009-10-20 4:26 PM in reply to: #2469678 |
Extreme Veteran 541 Colorado | Subject: RE: I'm a data junkie - interesting none the less tkd.teacher - No, no, no - no excuses here. I've been going to my master's swim class faithfully twice a week for the past 6 months, and my swim has improved a lot. I'm currently working with the instructor to break my swim down bit by bit, and then reconstruct it, hopefully with fewer issues. 8^P And, in the first analysis of the sprint distance, I'm the red square in the graphs. Hmm....funny. I'm the red square. That seems about right. I didn't run the 5430. I thought the results for the sprint distance would be badly skewed because of all the beginners (aka "noise") in the data. I was surprised to see it generally hold on the HIM. Edited by KirkD 2009-10-20 4:29 PM |
2009-10-20 4:32 PM in reply to: #2469690 |
Extreme Veteran 541 Colorado | Subject: RE: I'm a data junkie - interesting none the less menglo - 2009-10-20 3:25 PM Geeky, yes, interesting, very! Do what you love because it makes you happy! Looking at those plots, it seems like this trend even holds true for the pros (for the most part). I think one big problem with looking at this data (after I posted I thought of this) is that they are pushing their bodies to the limit, and with such a "small" sample size it's really hard to find as much of a trend as we see in just the masses. With them pushing so hard, you have bodies shutting down and DNFing. This is very interesting and I think it would be an interesting thesis for someone to play with for a PhD. I'm with you 100% on the sample size for pros. I may see if I can collect the Pros' IM data for the entire season and do the same analysis. If I'm feeling up to it, I may break it down further into subgroups as one poster suggested. Can I get an honorary PhD for this work? 8^) |
2009-10-20 4:42 PM in reply to: #2469678 |
Champion 5781 Northridge, California | Subject: RE: I'm a data junkie - interesting none the less tkd.teacher - 2009-10-20 2:21 PM Oh, and I don't find a Kirk in the triathlon results... John Maybe a search by the last name on KirkD's BT profile page might turn something up for that race... Edited by tcovert 2009-10-20 4:44 PM |
2009-10-20 4:43 PM in reply to: #2469487 |
Pro 5892 , New Hampshire | Subject: RE: I'm a data junkie - interesting none the less If you have the buoyancy of granite, should you stop sweating the swim then as well? |
2009-10-20 4:49 PM in reply to: #2469712 |
Extreme Veteran 541 Colorado | Subject: RE: I'm a data junkie - interesting none the less tcovert - 2009-10-20 3:42 PM tkd.teacher - 2009-10-20 2:21 PM Oh, and I don't find a Kirk in the triathlon results... John Maybe a search by the last name on KirkD's BT profile page might turn something up for that race... Ah - different question than my "red square" answered. Kirk is my middle name, so my tri results are under "Robert" my first name. Look for a Robert that came in 80th overall. Check my race log, too. |
2009-10-20 5:49 PM in reply to: #2469487 |
Expert 626 | Subject: RE: I'm a data junkie - interesting none the less Beyond geeky but very interesting |
2009-10-20 7:02 PM in reply to: #2469487 |
Elite 5316 Alturas, California | Subject: RE: I'm a data junkie - interesting none the less Another more swimmer friendly way to look at the data would be to weight the scores by the amount of time it takes the mean/average person to finish them. So if the average swim time for an IM were 1:30:00 and average bike time were 7:00:00 and the average run time were 5:30:00 then they would not be equally weighted measures predicting total outcome. Practically speaking the bike has more minutes so it has a heavier weight on the results. If we equalized the weight and then reanalyzed the numbers it would be interesting to see what we find. Total time for my random numbers = 14:00:00. S=10.7%, B=50% and R=39.3% of the total time of the IM. If we equalize the weighted contribution of s/b/r does that make swim correlate better? As it is, swim gets a raw deal as it is only allowed 1/4 to 1/5 of the variance of the other two, thus limiting its impact on the whole. |
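The percentages above are easy to check. This is a small sketch using the same illustrative split times from the post (1:30:00, 7:00:00, 5:30:00); the helper names are hypothetical, and transition times are ignored just as they are in the post.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def time_shares(splits):
    """Return each leg's fraction of the combined swim/bike/run time."""
    secs = {leg: to_seconds(t) for leg, t in splits.items()}
    total = sum(secs.values())
    return {leg: s / total for leg, s in secs.items()}


# Illustrative average splits taken from the post (not measured data)
splits = {"swim": "1:30:00", "bike": "7:00:00", "run": "5:30:00"}
shares = time_shares(splits)

for leg, share in shares.items():
    print(f"{leg}: {share:.1%}")
# swim: 10.7%
# bike: 50.0%
# run: 39.3%
```

One way to "equalize" the legs, if someone wants to try it, would be to rescale each leg's times to the same spread before comparing them, but that is a separate modeling choice and not something this sketch does.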
2009-10-20 7:31 PM in reply to: #2469487 |
Coach 10487 Boston, MA | Subject: RE: I'm a data junkie - interesting none the less Could the data just be representative of how in general triathletes are terrible swimmers given the little emphasis they place on swim training? My guess is that if you would do the same analysis for pro racing (i.e. ITU racing) a different picture would emerge. |
2009-10-20 7:45 PM in reply to: #2469487 |
Master 1584 Fulton, MD | Subject: RE: I'm a data junkie - interesting none the less I haven't read all of the replies, but I just wanted to say Dude, you rock hard. That is all. :-) |
2009-10-20 11:18 PM in reply to: #2469956 |
Extreme Veteran 541 Colorado | Subject: RE: I'm a data junkie - interesting none the less Baowolf - 2009-10-20 6:02 PM Another more swimmer friendly way to look at the data would be to weight the scores by the amount of time it takes the mean/average person to finish them. Interesting idea. If I have time I may give that a try. At this point I was only looking at discipline ranking vs overall ranking to see what the relationship was for placement. The take home from what I've done is that placing low in the swim (MOP, probably) does not preclude a good placement overall. I like the time spent idea with differential weighting. Jorge said: Could the data just be representative of how in general triathletes are terrible swimmers given the little emphasis they place on swim training? I'm sure you're right. I haven't done the analysis for pros yet with enough data to really be trustworthy. I may see if I can collect the entire season's IM Pro data and repeat the process for that set. There would be repeated data of sorts - same athlete in different races - but I'm not sure that will matter. The conditions are still somewhat independent due to different courses and a different mix of competitors. Anybody have thoughts on that?? jcnipper said: I haven't read all of the replies, but I just wanted to say Dude, you rock hard. Why, thanks. 8^) If I spent as much time training as I spend thinking about data, I WOULD rock. -Kirk |
2009-10-21 12:57 AM in reply to: #2469956 |
Champion 5781 Northridge, California | Subject: RE: I'm a data junkie - interesting none the less Baowolf - 2009-10-20 5:02 PM Another more swimmer friendly way to look at the data would be to weight the scores by the amount of time it takes the mean/average person to finish them. So if the average swim time for an IM were 1:30:00 and average bike time were 7:00:00 and the average run time were 5:30:00 then they would not be equally weighted measures predicting total outcome. Practically speaking the bike has more minutes so it has a heavier weight on the results. If we equalized the weight and then reanalyzed the numbers it would be interesting to see what we find. Total time for my random numbers = 14:00:00. S=10.7%, B=50% and R=39.3% of the total time of the IM. If we equalize the weighted contribution of s/b/r does that make swim correlate better? As it is, swim gets a raw deal as it is only allowed 1/4 to 1/5 of the variance of the other two, thus limiting its impact on the whole. Questionable assumption there. Specifically (and I think another BTer addressed this in passing in another thread a few months ago), the spread of finishing times in bike vs. run is not the same (different distribution curve...flatter for the run...e.g., slowest bike time in IM is about 1.89x the fastest, slowest run is 2.65x the fastest), which accounts for why you can argue for a fairly equitable correlation of bike finish and run finish with overall placement. |
2009-10-21 1:14 AM in reply to: #2469487 |
Master 2460 | Subject: RE: I'm a data junkie - interesting none the less I like the data. Pretty cool - I'm familiar with these coming from a science background. My thoughts though:
- Better bike/run correlation with race performance likely has a lot to do with the extensive x-training effects on the legs that overlap between the two sports. Hence, if you're a strong cyclist, odds are good that you'll be a respectable runner, and vice-versa. This will lead to a greater correlation of the lower-body related sports with the overall race result compared with the swim.
- This bike/run phenomenon can definitely be exploited by poor swimmers to climb the standings, as strong swimmers can only excel on one leg of the race, whereas the x-over allows for a predominantly strong runner or cyclist to still do significant damage on the other run/bike discipline, even with very limited training.
- Despite this bike/run bias, the swim is critically important if you want to compete for an AG spot. You don't have to win it, but you have to do well on it in Oly/sprints. Even front of MOP likely won't be good enough. AG winners in SoCal are consistently in the top 10% of their AG, if not the top 5%, on the swim. Especially for Oly/Sprints, where the swim is a significant portion of the race. |