How did Lucas Sims show himself ready for the majors?

There’s been a lot of discussion of this on the Braves radio and telecasts, but Lucas Sims seems to be one of those few players who had serious control issues and managed to solve them.  His ERA has always been okay to bad, and his strikeout numbers have been fine, but his stats were undermined by a very high walk rate.  Here are two charts that show his progress.  The seasons and partial seasons on the X axis are:

  1. 2015 season with A-advanced Carolina (Carolina League)
  2. 2015 season with AA Mississippi (Southern League)
  3. 2016 season with AA Mississippi (Southern League)
  4. 2016 season with AAA Gwinnett (International League)
  5. 2017 season with AAA Gwinnett (International League)
  6. 2017 season with Atlanta (National League)

For each chart, I’ve put Sims’ numbers along with the league average for that season for comparison.

So we see that while his ERA was actually slightly below average at Gwinnett this year, the bigger story is that he managed to lower his walks per 9 innings from 6.7 in his 2016 Gwinnett season to 2.8 at Gwinnett this year.  His control has carried over into the majors, as he has only a 1.5 BB9 in his first two games.  Let’s see how well he can do today against the Cardinals, who are fighting desperately for the National League Central division.
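For reference, walks per nine innings is just a pitcher's walks scaled to a nine-inning game. Here is a minimal R sketch of the arithmetic; the function name `bb9` and the walk and innings totals are made up for illustration and are not Sims’ actual numbers.

```r
# Walks per nine innings: BB9 = 9 * BB / IP.
# Innings must be a true decimal (e.g. 50 + 1/3), not the box-score
# shorthand "50.1".
bb9 <- function(walks, innings) {
  9 * walks / innings
}

# Hypothetical totals, chosen only to show the arithmetic
example_bb9 <- bb9(walks = 20, innings = 60)
print(example_bb9)  # 3
```

The same function works on whole columns, so it can be applied to a data frame of seasons in one call.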


Treemap Comparing the Salary and WAR of the Braves

This chart includes everyone on the current 40-man roster (and so includes players on the DL, excludes players the Braves have traded, etc.) who contributed this year.  I did not include salaries for non-roster players the Braves may still be paying.  The size of the square is their salary, the color depth is their WAR, and their location is where they play.  Batters to the left, pitchers to the right, infielders to the left of outfielders, relievers below starters.  The WAR and salary data come from, and for players without a listed salary, I assumed they make the league minimum (which is probably very close to accurate).


What we see is that Freeman is the most valuable player, Inciarte is a steal at his salary, and we’ve been getting a lot out of our catchers.  Nothing too surprising there.  Take a step back, and see that our pitchers are all about equal, and salary doesn’t matter.  Kemp is the standout here: even though he’s hitting .290 with 14 HRs at the moment, his poor defense has more than eliminated his offensive value.  For fun, let’s take a look at the same chart but just isolating the dWAR (Defensive Wins Above Replacement):


This again shows the value of Inciarte and Suzuki, the decline of Markakis compared to other right fielders around the league, and the hole in left that is Matt Kemp. Also notable is that Swanson’s defense is still valuable, even if his offense has struggled.



Johan Camargo and the History of Interesting Baseball Injuries

Yup, it’s true. Johan Camargo injured himself coming on to the field. He is certainly not alone. Here are some links to some fun (to the limited extent that injuries can be fun) baseball injury stories.

Some all-timers from an old ESPN page

A fun list from SB Nation

From Mental Floss (one of my favorite sites)

A Fox Sports list (with pictures!)

In any case, it looks like Johan will only be out for a couple of weeks, and the Braves look to be returning Dansby Swanson to the majors to replace him.

Freddie Freeman having an all-time RISP season…

Check out the video put out by the Braves on Twitter. Love the staff that works that feed, and I especially love them when they get their stats out. Freddie Freeman is having an amazing season, and we’ll have to return to this question at the end of the year to see if he made it, and what his total stats would’ve been like without the injury.

Which Braves Transaction has had the Largest Impact on Their Record?

There have been quite a few events in the Braves’ season, and I decided to investigate what impact those events have had on the Braves’ record this season. The five events I chose are the injury to Freddie Freeman, the signing of Matt Adams, the designation (and eventual release) of Bartolo Colon, the return of Freeman, and the trading of Jaime Garcia to the Twins. This chart shows the relationship of the Braves’ record to .500 after each game this season, with those events plotted. For fun, I also added a 20-game moving average.
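For anyone who wants to build a similar line, here is a rough R sketch of the two series: games relative to .500 (wins minus losses after each game) and a trailing 20-game moving average of that number. The `results` vector is invented for illustration and is not the Braves’ actual game log; the variable names are my own.

```r
# Hypothetical results: 1 = win, 0 = loss (NOT the real season)
results <- rep(c(1, 0, 0, 1, 1, 0), times = 10)

# Games relative to .500 after each game: +1 for a win, -1 for a loss
over500 <- cumsum(ifelse(results == 1, 1, -1))

# Trailing 20-game moving average; sides = 1 means only past games are used,
# so the first 19 entries are NA
ma20 <- as.numeric(stats::filter(over500, rep(1 / 20, 20), sides = 1))

# plot(over500, type = "l"); lines(ma20, col = "blue")
```

Using a trailing window (rather than a centered one) means the average at any game only reflects games already played, which is what you want when lining it up against events like a trade.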

What we can see is that the season has broken up into 4 “periods”.  First, the dramatic streakiness of the first few weeks of the season.  Second, the relative stability at around 6 games below .500 when Freeman was injured and Adams was brought in.  Third, the high point of the summer when the Braves were around 2-3 games under .500 and hit the .500 mark after finishing a sweep of the Diamondbacks on July 16th.  Fourth, the current period in which the Braves have seen a precipitous decline after trading away Jaime Garcia.  This could be explained by a turn to younger pitchers, a feeling in the clubhouse that the team had given up on the season, or the loss of a positive dugout psychological force in Garcia.  That’s probably not answerable, but it does seem that the trade of Garcia has had more of an impact on the record than the other four events.  Of course, the next month or so will tell us if the Braves are in a hiccup or a new “normal” down around 8 games below .500 or worse.


What Has Gone Wrong with Julio Teheran this Season?

At the beginning of the season, we pondered whether Julio Teheran was going to be able to retain his number one starter status (he is a 2-time all star, after all), or whether he would regress to a 2nd or 3rd starter. As it turns out, he may have fallen even farther. Of the 70 pitchers who have pitched at least 109 innings (the number of games the Braves have played), Julio is ranked 59th in ERA. He has pitched like a 5th starter. Why has this happened?

First, let’s look at some of his stats per 9 innings.

All of these are “bad” stats (lower is better), except for the light blue “strikeouts per 9 innings” number. All of them have moved in the wrong direction, and all of them are the worst of Julio’s career since he became a full-time starter in 2013. We mentioned his fastball velocity in the post at the beginning of the season. Here are the numbers, according to his Fangraphs page.

That’s not good.  It’s only a little under 2 mph difference, but a sub 92 mph fastball is a problem for many pitchers.  It means the batters have slightly more time, and there is that much less difference between his fastball and change-up.  He’s also serving up home runs at a crazy-high rate (he’s tied for 64th among those 70 pitchers with 109 innings pitched).

Given his loss in velocity and strikeout rate, he will have to start pitching to location to get more awkward swings to produce outs.  Since his walks and home runs are way up, that doesn’t seem to have happened.  At this point, he’s a 4th-5th starter, and will have to improve mightily to get back to his salary expectations.  His current WAR is 0.0 on the Baseball-Reference version and -0.5 on the Fangraphs version.  We’ll have to see if he can turn things around this season, but it doesn’t look good so far.


Is Foltynewicz trending in the right direction?

Mike Foltynewicz starts this evening for the Braves, and he is closing in on the end of his third year on the club.  He’s also now 25 years old, so we ask: is he becoming more effective?  Let’s look at the Game Log data from  We’ll exclude games he appeared as a reliever (one game this season, three in 2015), which gives us 57 starts for the Braves.  What kind of trend lines do we see?

Here we have a plot of every start Folty has had for the Braves.  The blue line is the linear trend line, clearly going down.
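A linear trend line like this can be fit in R with `lm()`. This is only a sketch: the `earned_runs` values below are invented placeholders, not Folty’s real game log, and the variable names are my own.

```r
# Hypothetical earned runs per start, in chronological order
earned_runs  <- c(5, 4, 6, 3, 4, 2, 3, 5, 2, 1, 3, 2)
start_number <- seq_along(earned_runs)

# Ordinary least-squares line: earned runs as a function of start number
trend <- lm(earned_runs ~ start_number)
slope <- coef(trend)[["start_number"]]

# A negative slope means earned runs per start are trending down
print(slope)

# plot(start_number, earned_runs); abline(trend, col = "blue")
```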

Here we see the three seasons as three separate box plots.  Each shows the median earned runs, lowest, highest, and the 25th and 75th percentile.  So, 50% of his starts were inside the box.  Not only are his earned runs per game decreasing, they are less widely distributed (excepting the two high outliers from this season noted in the chart).
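The numbers behind a box plot are easy to pull out directly. Here is a small R sketch with made-up data (the `starts` data frame and its values are hypothetical) that computes the median and interquartile range per season and flags the points a box plot would draw as outliers.

```r
# Hypothetical earned runs per start, five starts per season
starts <- data.frame(
  season = rep(c(2015, 2016, 2017), each = 5),
  ER     = c(5, 4, 6, 3, 7,  4, 3, 2, 5, 3,  2, 3, 1, 2, 8)
)

# Median and interquartile range (height of the box) for each season
medians <- tapply(starts$ER, starts$season, median)
spreads <- tapply(starts$ER, starts$season, IQR)

# Points beyond 1.5 * IQR from the box, i.e. what boxplot() draws as outliers
outliers_2017 <- boxplot.stats(starts$ER[starts$season == 2017])$out

# boxplot(ER ~ season, data = starts)
```

One caveat: `boxplot()` draws the box at Tukey’s hinges, which can differ slightly from the quartiles that `IQR()` and `quantile()` report on small samples.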

Interestingly, he doesn’t seem to have a strong correlation between his strike% (the number of strikes as a percentage of pitches) and either earned runs or Game Score (the Bill James numerical measure of the effectiveness of a start).  It would appear being “effectively wild” is a big part of his game.   He has already totaled his WAR from last year, and we’re at the beginning of August, so it seems the answer to our question is yes, Foltynewicz seems to be trending in the right direction.  Of course, this kind of statement can be a kiss of death, coming right before he gives up 8 runs in an inning, but that’s why you play the games.
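The correlation check mentioned above is a one-liner in R. The sketch below uses invented numbers (both vectors are hypothetical, one value per start) just to show the call; a value near 0 would indicate little linear relationship between strike% and earned runs.

```r
# Hypothetical per-start values: strike percentage and earned runs
strike_pct  <- c(62, 58, 65, 60, 63, 57, 66, 61)
earned_runs <- c(3, 2, 4, 1, 5, 3, 2, 4)

# Pearson correlation coefficient, always between -1 and 1
r <- cor(strike_pct, earned_runs)
print(r)

# cor.test(strike_pct, earned_runs) adds a p-value and confidence interval
```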


How likely is it that Nick Markakis will get to 3,000 hits?

Last night, Nick Markakis became the latest major leaguer to accumulate 2,000 hits. He’s 10th on the active list as of today. The nine ahead of him are:

1. Ichiro Suzuki (3060)
2. Adrian Beltre (3002)
3. Albert Pujols (2918)
4. Carlos Beltran (2699)
5. Miguel Cabrera (2608)
6. Robinson Cano (2318)
7. Matt Holliday (2067)
8. Jose Reyes (2052)
9. Victor Martinez (2022)

Given that he’s 33 years old, I’m wondering what his chances are at getting to 3,000.

Let’s start with the current set of batting average and plate appearances for major leaguers as of last year. We’ll restrict it to players with at least 501 Plate Appearances, to give us a good read on those who were healthy everyday players, and would qualify for awards like the batting title.

Looks like something close to a normal distribution to me, with a big drop off in Plate Appearances at around 35. How about Batting Average?

This is all of the Batting Averages for 2016 players with at least 501 plate appearances. The blue line is a simple quadratic regression. Note how the line goes up for younger and older players. We would assume this is because younger players need to be able to hit to get on the field, and older players (who can’t run, or play defense as well) need to hit to stay on the field.  In the newest installment of “Name That Outlier!”, can you guess who that dot on the right is, the 40 year-old who hit over .300 last year?  ….


It was David Ortiz.
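For the curious, a quadratic regression like the blue line can be fit in R by adding a squared age term to `lm()`. This is a sketch on synthetic data: the `age` and `avg` vectors are generated to be roughly U-shaped and are not the 2016 numbers.

```r
# Synthetic data: batting average dips in the middle of the age range
age   <- 22:40
noise <- rep(c(0.005, -0.005, 0), length.out = length(age))
avg   <- 0.270 + 0.0004 * (age - 30)^2 + noise

# Quadratic fit: avg = b0 + b1 * age + b2 * age^2
fit <- lm(avg ~ age + I(age^2))
b   <- coef(fit)

# A positive b2 means the curve opens upward (higher at both ends)
b2 <- b[["I(age^2)"]]

# Age at which the fitted curve bottoms out
vertex_age <- -b[["age"]] / (2 * b2)

# plot(age, avg); lines(age, fitted(fit), col = "blue")
```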

So, without going into the probability calculations, we can see that while his average will likely stay where it is, his availability will probably start a deep decline in the next 2-3 years.  As it is, he is second in career fielding percentage as a right fielder. He’s only made 6 errors in his nearly 3 years with the Braves. While his range (already below league average for right fielders) may go down, I can’t think of a good reason he’ll become more error prone. In any case, the Braves have been paying him $11 million per year for just under 2.0 WAR per year, and he’s under contract through the end of next year. Fangraphs, which uses a different calculation for WAR, uses the price of free agents and players’ WAR to calculate a player’s monetary value. They say that Markakis was worth $11.8 million in 2015 (slightly higher than his actual salary), $9.1 million in 2016 (lower than his salary), and has been worth only $700,000 so far this year due to reduced defensive range and power. Check out his Fangraphs page here.

Dealing with “TOT” rows in data from

This information may be out there already, but I couldn’t find it quickly. If we are using data from, we will often come across situations in which a player played for two or more teams in the same season. The site deals with this by putting a “TOT” row first, which totals the data from the other rows, and then following it with the rows for the individual teams. Most of the time, in my experience, what I want is just the “TOT” data, and I want to ignore the team-specific data as superfluous. Here’s how to do that in Excel, and in R. Sadly, this functionality is missing in OpenOffice/LibreOffice Calc, and the path for removing duplicates is much too complicated for me to put up with for normal workflow.

The example I’m using is the .CSV of all of the batting statistics for players in 2016. The data is from, and you can either copy and paste the data into a spreadsheet and save it as a CSV for R, or you can use the “Share & More” button above the table to convert the table to a comma-delimited version, copy that into a spreadsheet, and use the “Text to Columns” function (under “Data” in Excel).

In Excel:

You start with the data loaded from the link above (be sure to delete the LgAvg or “League Average” row), and so will have 1,611 rows of data plus the header row.

1) Select all cells using the sweet multi-selector to left of Column A and above Row 1.
2) Under the “Data” tab, select “Remove Duplicates”
3) Click on “Unselect All” and Click on the Box next to “Name”
4) Click “Ok”

You should have a message box pop up and let you know that 258 duplicate rows have been removed, leaving you with 1,354 rows (including the header row).

In R:

1) Load the .csv into R with a command like:

stats2016 <- read.csv("c:/2016stats.csv", header=TRUE)

(This path is for Windows. On Linux and Mac the path will look different, with no drive letter, but R accepts forward slashes on every platform.)

2) If you type:

nrow(stats2016)

You’ll see we have our 1,611 rows.

3) There are multiple options to get rid of the rows we don’t need, but one simple way is to type something like:

nonDupes <- stats2016[!duplicated(stats2016["Name"]),]

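(duplicated() flags every repeat of a name after its first appearance, so this keeps only the first row for each player. Because the “TOT” row comes first, that is the row that survives for players with more than one team.)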

4) If we type:

nrow(nonDupes)

This tells us that our variable has 1,353 rows (it doesn’t count the header row, since we told it there was a header row when we loaded it).

5) So, if you wanted to isolate stolen bases from this data set, you would type something like:

SBs <- data.frame(Name = nonDupes$Name, SB = nonDupes$SB)
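If you would rather not depend on the “TOT” row being listed first, another option is to filter on the team column directly. This is a sketch under assumptions: I’m assuming the team column is named `Tm`, and the tiny `stats` data frame below is invented so the example runs on its own.

```r
# Hypothetical miniature of the batting table (made-up players and numbers)
stats <- data.frame(
  Name = c("Player A", "Player A", "Player A", "Player B", "Player C"),
  Tm   = c("TOT", "ATL", "MIN", "ATL", "STL"),
  SB   = c(12, 5, 7, 3, 0),
  stringsAsFactors = FALSE
)

# Names that have a TOT row, i.e. players with more than one team
multi_team <- stats$Name[stats$Tm == "TOT"]

# Keep a row if it is a TOT row, or if the player has no TOT row at all
keep <- stats$Tm == "TOT" | !(stats$Name %in% multi_team)
totals_only <- stats[keep, ]

print(totals_only)
```

Like the duplicated() approach, this still matches on name alone, so two different players who share a name would need a second identifying column to be handled correctly.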

There are loads of tutorials for both Excel and R if these were more complicated than you’re ready for, but if you’re like me and just need the workflow steps to work with data, I hope this helps.