Showing posts with label Clayton Kershaw. Show all posts

Monday, September 19, 2016

Buster Posey, the Dodgers Rotation, and Enums

Earlier we went over the major data types, the Strings, Ints, Floats, Doubles, and Bools of the Swift programming world. These data types can cover a lot of ground and handle a lot of different situations, but they can be like fitted caps inherited from older siblings: Nice to have, but they don't quite fit as well as one that is your actual size.



That's where enumerations, classes, and structures come in. They are data types that you can customize in Swift. We'll start off with enumerations, aka, enums.

What is an enum?
An enumeration - or enum - is akin to its meaning:



Now let's take a closer look at what this means in terms of Swift coding.

In a previous post, we talked about switch statements and how we used them when we thought a function could have several different outcomes. The example we used was of Grady Little going out to visit Pedro Martinez in 2003 and how Little could stick with Pedro, pull Pedro for a righty, pull Pedro for a lefty, or pull Pedro for a knuckleballer. We also included a default, which a switch statement needs whenever its cases don't cover every possible value: "Call up Bobby Valentine for an emergency mustache and shots at Stan's."

Unlike that switch, a switch over an enum does not require a default, as long as every case of the enum is handled.

Why are enums important?
At first glance, especially with other options such as switch statements, classes, and structures, it isn't exactly clear what makes enums so important. For the best explanation, I'm going to hand it over to a Quora response from Brent Royal-Gordon who puts it like this:



When do we use enums?
If switch statements are particularly good at handling potentially open-ended circumstances or cases thanks to their default option, then enums are particularly good at handling well-defined cases.

Let's take a look at how Buster Posey hits against the 2016 Dodgers Rotation of Clayton Kershaw, Kenta Maeda, Rich Hill, Julio Urias, and Ross Stripling.



Here we've created a DodgersRotation enum by writing the enum keyword first and then the capitalized name of the enum. After opening a curly bracket we set out our different Dodger starting pitching cases from Kershaw through Stripling. Then we close up our enum with the closing curly bracket on line 11.

Let's start with how Posey does against Kershaw. On line 13 we create a variable, busterVersusDodgersRotation, and define it not as a String or Int, but as a DodgersRotation data type. To focus on Kershaw, we choose the Kershaw case of the DodgersRotation enum (i.e., DodgersRotation.Kershaw). Because we've selected the Kershaw case, line 19 prints out on the right. Hitters have a career .566 OPS against Kershaw so Buster is slightly above average.
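Since the screenshot is not reproduced here, the following is only a rough sketch of what such an enum and switch might look like. The function name matchupDescription and the printed messages are made up for illustration, and the capitalized case names follow this post (current Swift style would lowercase them). Note that the switch has no default because all five cases are handled.

```swift
// A sketch of the rotation enum described above.
enum DodgersRotation {
    case Kershaw
    case Maeda
    case Hill
    case Urias
    case Stripling
}

// Hypothetical helper: returns a message for whichever starter is selected.
func matchupDescription(for starter: DodgersRotation) -> String {
    // Every case is listed, so Swift does not ask for a default.
    switch starter {
    case .Kershaw:
        return "Posey is facing Kershaw."
    case .Maeda:
        return "Posey is facing Maeda."
    case .Hill:
        return "Posey is facing Hill."
    case .Urias:
        return "Posey is facing Urias."
    case .Stripling:
        return "Posey is facing Stripling."
    }
}

// The value is typed as DodgersRotation, not String or Int.
let busterVersusDodgersRotation = DodgersRotation.Kershaw
print(matchupDescription(for: busterVersusDodgersRotation))
```

If a case were removed from the switch, the compiler would complain that the switch must be exhaustive, which is a handy safety net when the rotation changes.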

But what if you don't want to keep updating Posey's OPS in the case print out statement? Can't we just set whatever Buster's OPS is to the proper pitcher's case and then use string interpolation to print it out in the print statement? Yes, we can. Check it out.



So we set Buster's OPS values against each Dodger starter up top in the enum, case by case. Once we do that and set up string interpolation in the switch statement, the OPS figures print out on the right (line 20). So far, Maeda's done a nice job against Buster.
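The original screenshot is missing, so here is one possible way to attach a number to each case and interpolate it, assuming Double raw values. Only the Kershaw figure (0.529) comes from this post; every other number is a placeholder and not a real stat, and the opsLine function name is hypothetical.

```swift
// One possible approach: store an OPS figure with each case as a raw value.
enum DodgersRotation: Double {
    case Kershaw = 0.529    // figure quoted later in this post
    case Maeda = 0.111      // placeholder, not a real stat
    case Hill = 0.222       // placeholder, not a real stat
    case Urias = 0.333      // placeholder, not a real stat
    case Stripling = 0.444  // placeholder, not a real stat
}

// Hypothetical helper: builds the printout with string interpolation.
func opsLine(for starter: DodgersRotation) -> String {
    switch starter {
    case .Kershaw:
        return "Posey's OPS against Kershaw is \(starter.rawValue)."
    case .Maeda:
        return "Posey's OPS against Maeda is \(starter.rawValue)."
    case .Hill:
        return "Posey's OPS against Hill is \(starter.rawValue)."
    case .Urias:
        return "Posey's OPS against Urias is \(starter.rawValue)."
    case .Stripling:
        return "Posey's OPS against Stripling is \(starter.rawValue)."
    }
}

print(opsLine(for: .Kershaw))
```

With this layout, updating a stat means changing one number at the top of the enum rather than editing the print statement.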

Enums and Raw Values
Enums can also store raw values. According to Swift documentation, "enumeration cases can come pre-populated with default values (called raw values), which are all of the same type." That's nice, but what does it mean? Let's take a look:



Let's say we wanted to assign numbers to the Dodgers starters, starting with their #1, Clayton Kershaw. First, we declared that the DodgersRotation enum would have Int raw values. Then we assigned a value to the first case, in this case 1 to Kershaw. Then, in line 13 we called the rawValue of Kenta Maeda and in line 14 we got 2. But we never explicitly assigned 2 to Maeda. How did it do that? When the raw values are Ints, Swift counts up from the last value we set, so Maeda automatically gets 2, Hill gets 3, and so on. That's the power of raw values in enums.
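A minimal sketch of the idea, using this post's case names (the variable names maedaSlot and thirdStarter are made up for illustration):

```swift
// Int raw values: only the first case gets an explicit number.
enum DodgersRotation: Int {
    case Kershaw = 1
    case Maeda, Hill, Urias, Stripling  // Swift fills in 2, 3, 4, 5
}

let maedaSlot = DodgersRotation.Maeda.rawValue
print(maedaSlot)  // prints 2

// Going the other way returns an Optional, because the number might not match any case.
let thirdStarter = DodgersRotation(rawValue: 3)
let nobody = DodgersRotation(rawValue: 9)
```

Here thirdStarter holds the Hill case, while nobody is nil because no starter was assigned 9.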

We can also use raw values with Strings.



By declaring the DodgersRotation enum to have String raw values on line 5, the computer knows that each case's raw value will be a String, so when we ask for the Urias rawValue on line 14 through string interpolation, it prints "Urias". Swift automatically uses the name of the case as its raw value, as if we had written case Urias = "Urias".

Now, if we want case Kershaw to be "Clayton", we will have to write case Kershaw = "Clayton". The computer still won't know first names for the Maeda, Hill, Urias, and Stripling cases, since it isn't personally on a first name basis with the rest of the Dodgers' rotation. Unless we spell those out too, their raw values stay "Maeda", "Hill", "Urias", and "Stripling".
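Here is a short sketch of String raw values with one explicit value and four implicit ones. In current Swift, a case without an explicit String value falls back to its own name.

```swift
// String raw values: Kershaw is spelled out, the rest default to their case names.
enum DodgersRotation: String {
    case Kershaw = "Clayton"
    case Maeda, Hill, Urias, Stripling
}

let ace = DodgersRotation.Kershaw.rawValue
let rookie = DodgersRotation.Urias.rawValue

print("Today's starter is \(rookie)")
print("The ace goes by \(ace)")
```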

Enums and Methods
What else can we do with enums? We can run methods on the enum types we create. Let's use methods to find out Buster Posey's OPS differential against two different Dodger pitchers, Clayton Kershaw and Ross Stripling.



Let's walk through this. First, we created the Posey enum which gives us the Posey data type that we'll run our method on, on line 32. Next we'll create two different cases, the Kershaw case and the Stripling case, as we'll put up Buster's numbers against each. Then we'll create a function, differential, that subtracts Buster's career OPS (.849) from his OPS against whichever pitcher case he's facing, Kershaw or Stripling. Because OPS uses decimals, we'll use Doubles to handle whatever OPS Buster has versus either Kershaw or Stripling and whatever the outcome of the equation will be.

On line 22 we declare a switch on self. What is self? Self is not an easy concept as it gets rather abstract rather quickly. In this case, self refers to whichever Posey case the method is being called on (Posey.Kershaw, for example), not to the Posey enum type as a whole. So can we write "switch Posey" and get away with it as if they are synonymous? Good question. Let's try. Here's what happens:



Basically, Xcode can take "switch Posey", but then it wants us to pass an argument in those parentheses (line 22). Unfortunately, we have nothing to pass, and even if we leave the parentheses empty Xcode will throw errors noting that it has no accessible initializers. In other words, Xcode wants a lot more from us than we need to give it to do what we need to do. Ergo, writing "switch self" is preferable as it references the current Posey case without having to worry about all the other stuff.

Back to line 23! What do we want our function to do? We want it to figure out the difference between Buster's career OPS and his OPS against a specific pitcher, in these cases either Kershaw or Stripling.

Outside and below the differential function, we then create a new variable, the vsKershaw variable on line 32. To the vsKershaw variable we assign the Kershaw case of the Posey enum data type. Posey.Kershaw has a lot of shorthand built into it so let's unpack it for a minute. Posey.Kershaw is a quick way of saying, "Pick the Kershaw case, so that when we run the differential method it subtracts Buster's career OPS of 0.849 from his OPS against Kershaw."

But wait! You say. We haven't told the computer what Buster's OPS against Kershaw is yet! You're right. Let's do that on line 33 where we'll create a variable, opsVsKershaw, and assign Buster's current career OPS against Kershaw (0.529) to opsVsKershaw.

So what do we have? We have the Kershaw case of the Posey enum set to the vsKershaw variable and we have Buster's career OPS versus Kershaw set to the opsVsKershaw variable. Now what?

So we have all the parts, now we need to put them together to make the function go. That's what happens on line 35. Line 35 reads like this, "Take the Kershaw case (as opposed to the Stripling case) of the Posey enum and run it through the differential function, which subtracts Posey's career OPS from his OPS versus Kershaw." That's a bit of a mouthful, but hopefully it makes more sense to you than opsDifferential = vsKershaw.differential(opsVsKershaw).

On line 36 we tell the computer to print our results to the console using string interpolation which prints out, "Posey's OPS against Kershaw is -0.320 compared to his career OPS of .849."
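Because the screenshot is missing, what follows is a reconstruction from the names used in the walkthrough (Posey, differential, vsKershaw, opsVsKershaw, opsDifferential), not the original code. The careerOPS constant, the formatted variable, and the use of String(format:) from Foundation to get three decimal places are assumptions.

```swift
import Foundation

enum Posey {
    case Kershaw
    case Stripling

    // Subtracts Posey's career OPS from his OPS against the selected pitcher.
    func differential(_ opsVsPitcher: Double) -> Double {
        let careerOPS = 0.849
        // self is the case this method was called on, e.g. Posey.Kershaw.
        switch self {
        case .Kershaw, .Stripling:
            return opsVsPitcher - careerOPS
        }
    }
}

let vsKershaw = Posey.Kershaw
let opsVsKershaw = 0.529
let opsDifferential = vsKershaw.differential(opsVsKershaw)

// Round to three decimal places for display.
let formatted = String(format: "%.3f", opsDifferential)
print("Posey's OPS against Kershaw is \(formatted) compared to his career OPS of .849.")
```

Both cases share the same arithmetic here, so they are combined on one line of the switch. If each pitcher needed its own calculation, each case could return something different.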

As you can also see through this particular example, it is important to give things names that make sense not only to you, but to other developers who may look at the code in the future. If you name things appropriately, others should be able to read your code and make some sense of it.

That's a good start on enums. We'll do more in the next post.

Special thanks to Benedikt Terhechte, who did a thorough job of explaining enums and raw values.

Challenge: Run through the examples above with a different hitter against different pitchers. Baseball-Reference.com is good at providing stats.

On Deck: Associated values and pattern matching in enums.



Wednesday, June 22, 2016

Clayton Kershaw, Floats, and the Incredible Shrinking WHIP

In our last post, we talked about the Swift integer or Int data type and ended up with a glaring problem: We couldn't use Int data types to figure out Cy Young's WHIP. Namely, because the result isn't an integer or whole number, it's a decimal.

Unfortunately, Swift doesn't have a decimal data type. Instead, it has Float and Double data types. Today we'll go over Floats and we'll give calculating WHIP another shot this time using Clayton Kershaw as our example.


What is a Float?
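A Float is a data type, like String and Int, that handles decimals and fractions. It is a 32-bit floating-point number, which is good for roughly six or seven significant digits; anything that needs more precision is a job for Double.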

Why are Floats important?
As we saw when trying to calculate Cy Young's WHIP, Floats are important because they help us work with numbers that aren't integers or whole numbers, but rather numbers that are fractions or decimals.

When do we use Floats?
We typically assign Float data types to variables and constants we know will require short decimals. In baseball, this includes batting average, on-base percentage, slugging percentage, OPS, WHIP, ERA, innings pitched, and fielding percentage to name the biggies. Here are some examples:



Baseball stats rarely pass the decimal point by more than four places as seen above. This suits Floats, which stay accurate to about six significant digits.
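To get a feel for how much precision a Float actually carries, here is a small sketch that stores the same long decimal in a Float and in a Double. The number itself is arbitrary and the variable names are made up for illustration.

```swift
// The same literal stored at two different precisions.
let asFloat: Float = 1.23456789
let asDouble: Double = 1.23456789

print(asFloat)   // the Float keeps only about seven significant digits
print(asDouble)  // the Double keeps all nine digits of this literal

// Widening the Float back to a Double shows the small rounding difference.
let roundingError = abs(Double(asFloat) - asDouble)
print(roundingError)
```

For a stat like a .300 batting average or a 1.02 WHIP, that rounding is far too small to matter.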

Another good reason to use Floats is to account for uncertain numerical outcomes. Classic cases of uncertain numerical outcomes occur when the output of a function is unknown going into the problem. For example, let's say Kershaw pitched a complete game (nine innings) giving up eight hits and one walk. This gives us a WHIP of 1 ((8 + 1) / 9 = 1). We could have used the Int data type here because we had all whole numbers (9 innings, 8 hits, 1 walk, 1 WHIP).

But what are the chances over the course of an entire season or career that Kershaw's WHIP will always be an integer? He may throw a couple no-hitters (0 WHIP), have a bunch of solid one WHIP games, and maybe some bad 2 WHIP games, but chances are his WHIP will be a Float. Because this is a highly likely outcome it makes sense to assign the Float data type to associated variables and constants early on so they can account for these uncertainties later on.

So let's try calculating Kershaw's career WHIP (as of 6/19/16) using Floats:



Maybe you're thinking, wait, hits allowed and walks allowed will always be Ints! In the same way that strings and Ints don't play well together, neither do Ints and Floats.
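As a small sketch of the all-Float approach, here is the complete-game example from earlier (eight hits, one walk, nine innings) rather than Kershaw's career totals. The second outing is hypothetical and only there to show a result that is not a whole number.

```swift
// Everything is declared as a Float so the values can be combined directly.
let hitsAllowed: Float = 8
let walksAllowed: Float = 1
let inningsPitched: Float = 9

// WHIP = (walks + hits) / innings pitched
let whip = (walksAllowed + hitsAllowed) / inningsPitched
print(whip)  // prints 1.0

// Hypothetical rougher outing: 7 hits and 2 walks over 7 innings.
let roughHits: Float = 7
let roughWalks: Float = 2
let roughInnings: Float = 7
let roughWhip = (roughWalks + roughHits) / roughInnings
print(roughWhip)
```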

There is a way to keep hits and walks allowed as Ints: converting them to Floats at the moment we do the math, which is often loosely called type casting. We'll get to it down the road.
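As a quick preview, and only a sketch, here is how the complete-game example might look with hits and walks kept as Ints and converted with Float(...) when the division happens.

```swift
// Counting stats stay as whole numbers.
let hitsAllowed: Int = 8
let walksAllowed: Int = 1
let inningsPitched: Float = 9

// This would not compile, because Swift will not mix Int and Float in one expression:
// let broken = (hitsAllowed + walksAllowed) / inningsPitched

// Converting the Int total to a Float first makes the types match.
let whip = Float(hitsAllowed + walksAllowed) / inningsPitched
print(whip)  // prints 1.0
```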

On Deck: Doubles