Why not?
Re: Why not?
Swarthy. I've missed these. 

-
roncli
- Posts: 1106
- Joined: Sun Mar 22, 2015 5:05 pm
- Location: Belmont, CA
Haha - nice!
Mel - I'm not sure how you got the impression Tom designed the algorithm. He did a lot of supporting analysis, but it's my design.
As to making it public - here's the thing. Games have visible and invisible elements - things you're supposed to manipulate and things you aren't. The visible elements, the scores and ammo and enemy AI, are kept simple and sensible and directly controllable because that makes it something players can enjoy manipulating. The invisible elements, on the other hand, are things designed to keep things moving and fun, and are things like the computer deciding when and where to spawn enemies, teleporting them to keep them close to you, things like that. They can be quite arcane and seem like cheating, and it's difficult to exert control over them... but done right, you know the effects.
The ladder algorithm is designed as an invisible element. If it was supposed to be fun and interesting to play for, I would have designed it much, much differently than I did. As it is, you have very little control over your score. Some of the biggest shifts you see happen when someone else has a good or bad game. It measures you on a rolling 90-day average, so changing it for better or worse takes a lot of games over a long time. You can change it... by consistently doing better or worse against everyone over a period of months. What you cannot do is make it go up when you have a great game. It often goes down.
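To illustrate why one great game barely moves a windowed average, here is a minimal sketch. It is not the ladder's actual code: the `(date, score)` game records, the function name `rolling_average`, and the plain mean are all assumptions made for the example, and the real rating also depends on everyone else's results.

```python
from datetime import date, timedelta

def rolling_average(games, today, window_days=90):
    """Mean score of games played within the last `window_days` days.

    `games` is a list of (date, score) pairs; this record shape is hypothetical.
    Returns None when no game falls inside the window.
    """
    cutoff = today - timedelta(days=window_days)
    recent = [score for played, score in games if cutoff < played <= today]
    return sum(recent) / len(recent) if recent else None

today = date(2016, 6, 30)
# Forty steady games, one per day, all scored 10.
games = [(today - timedelta(days=d), 10) for d in range(1, 41)]
before = rolling_average(games, today)
# One great game (score 20) played today.
after = rolling_average(games + [(today, 20)], today)
```

With forty games already in the window, a single score of double the usual value shifts the average by only a fraction of a point, which is the weigh-yourself-less-often effect in miniature.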
Anyone try to lose weight? They tell you not to weigh yourself every day because the movements of the scale day to day are random, and it will only serve to frustrate you. You weigh yourself at a long enough interval that there's some signal in the noise because *that* is motivating.
The main design goal of the algorithm is to be as accurate as it can be with as little data as possible. It is a novel algorithm based on the way Descent players have historically evaluated each other: "He got 15 on Melvin, he must be okay." In a nutshell, each player's scores against each other are averaged to form a set of pressures trying to bring two pilots together or push them apart, and then an evolutionary algorithm is applied which seeks a set of pilot ratings that minimizes those pressures.
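The "pressures plus evolutionary search" idea can be sketched roughly as follows. This is a toy under stated assumptions, not the ladder's algorithm: the squared-mismatch pressure, the one-pilot-at-a-time mutation, and the names `total_pressure` and `evolve` are all invented for illustration.

```python
import random

def total_pressure(ratings, margins):
    """Sum of squared mismatches between rating gaps and average observed margins.

    `margins[(a, b)]` is the average margin by which pilot a beats pilot b
    (a hypothetical input format).
    """
    return sum((ratings[a] - ratings[b] - m) ** 2 for (a, b), m in margins.items())

def evolve(margins, pilots, generations=5000, step=0.5, seed=0):
    """Simple evolutionary hill climb: mutate one rating, keep it if pressure drops."""
    rng = random.Random(seed)
    best = {p: 0.0 for p in pilots}
    best_cost = total_pressure(best, margins)
    for _ in range(generations):
        candidate = dict(best)
        candidate[rng.choice(pilots)] += rng.gauss(0, step)  # random mutation
        cost = total_pressure(candidate, margins)
        if cost < best_cost:  # selection: only improvements survive
            best, best_cost = candidate, cost
    return best, best_cost

# Average margins: A beats B by 5, B beats C by 3, A beats C by 8.
margins = {("A", "B"): 5.0, ("B", "C"): 3.0, ("A", "C"): 8.0}
ratings, cost = evolve(margins, ["A", "B", "C"])
```

Note that in a setup like this, a pilot's rating depends on every pairing in the table, so someone else's good or bad game shifts the pressures on everyone connected to them.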
But it is more complex than that and has internal measures that - like teleporting enemies - would seem like cheating to a player seeking to manipulate the algorithm. For example, it has competing design goals in that I want it to rate a new player joining the ladder as fast as it possibly can and still be accurate. Sometimes this is just two games! But I also want it to offer promotions and demotions only when it's really, really sure. The design goal is that by the time it happens, no one should be surprised. So you will often see a pilot's raw score above or below the threshold where it would normally trigger, while the algorithm waits for various forms of confirmation.
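One way to picture "waiting for confirmation" is a rule that only fires after the raw score has stayed past the threshold for several checks in a row. The post does not say what forms of confirmation the ladder actually uses, so the consecutive-check rule and the name `confirmed_promotion` below are purely hypothetical.

```python
def confirmed_promotion(raw_scores, threshold, confirmations=3):
    """True once the raw score has exceeded `threshold` on `confirmations`
    consecutive checks (a hypothetical confirmation rule, not the ladder's)."""
    streak = 0
    for score in raw_scores:
        # Any dip back to or below the threshold resets the streak.
        streak = streak + 1 if score > threshold else 0
        if streak >= confirmations:
            return True
    return False

# Score pokes above 100 twice but never holds for three checks: no promotion yet.
bouncing = confirmed_promotion([99, 101, 98, 102, 103], threshold=100)
# Score holds above 100 for three checks in a row: promotion is offered.
steady = confirmed_promotion([99, 101, 102, 103], threshold=100)
```

Under a rule like this, a raw score can sit above the line for a while without anything happening, which matches what watchers of the ratings would see.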
You think that would be motivating to watch? When I was going through the process of making gold, I literally did not look at the page for months at a time, and I wrote the darn thing and understand it better than anyone.
I *trust* it... but I don't wanna *watch* it while it works, particularly not on me.
And yet some watching is necessary. When we first launched the ladder, there were outright bugs as well as design issues - we found out a lot of ways it can get confused. The goal is to catch and fix those before they result in obviously wrong promotions or demotions. Poor Sirius was gold one week and silver the next for a while, until we got certain things better under control. One time Merk392 got dropped to silver and so many people came by to tell me that wasn't right that I made a mumble channel titled "MarksNotSilverItsABug" while working on the problem.
Yeah - we've found and fixed a lot of ways for it to fail, and I haven't needed to adjust it in - holy wow, has it been 18 months already? - though the folks who watch it can surely attest that some of that is neglect. I have talked about wanting to keep some things a bit more stable and haven't made the time to work on them. Real soon now. The baby's sleeping, I have energy to work through my professional backlog now, then I can get into the Descent backlog. Anyway.
Point is, there are always new and exciting ways for something like this to fail. We do need people watching, but as few as possible. I don't think locking it down to just Tom and me is a good idea, as the pilots I really want looking at it are the guys who play everyone, a lot, and can say "oh yeah he's sucking this week" vs "wtf is this, are we looking at a bug/design issue here?" Once upon a time that *was* Tom and me, but right now... it's not. But the other thing is, I want a few pilots who are familiar enough with the thing to have a sense of what's normal and what isn't, and that takes time to build. These are the same people I want making behind the scenes design decisions for the ladder - admins and, well, people who could step in and replace an admin.
I understand a natural curiosity about the thing. But I really don't advise playing for it. The only way to improve your score that I'm aware of is to do better against everyone on average over a period of months. Seriously. It's quite robust. I'm not aware of a way to improve your score faster than your piloting in general, and whenever such ways are found accidentally or intentionally, we treat them as flaws in the system and fix them. And anyway, why would you want an inflated score? So you can have tougher margins to play for in promo challenges? So you can be promoted early and feel extra vulnerable while pilots gun for you for silver wins?
We've talked about making snapshots of the ratings visible, like maybe at season switchover. And you do get snapshots, with promo challenge targets. But I think every time we've had the conversation, I've been persuaded that invisible game elements are better if they stay invisible. You know how knowing how the magic trick works doesn't make you enjoy it more? There are elements of game design that are like that. Anyone who has ever been a DM, hashtag nerdreference, would tell you that screen exists for a reason, and it wouldn't improve the experience for anyone to take it away.
There are lots of things to play for that aren't invisible. Promo challenges are one of them! These have simple rules and predictable, controllable outcomes! These are a rating system that is designed by the rules for visible game elements rather than invisible ones. Optimize the heck out of your performance in one of those. It's fun.
Play for rank. Play for record. Play for trophies. Play for medals. Play to beat a pilot you've never beaten before, or get a score you've never gotten before. Play for busdrivers. If you're Mark, play to astonish your peers by doing what no one thinks possible. There are many, many goals you can set for yourself on the path to improving your piloting. Some set by the ladder, some set by your peers, some you choose for yourself. These are worthy and fun. Focus on them.
Of course you want to make it to the next level. We all do. But improving your piloting overall is a vague challenge to pursue, and watching your skill rating is a frustrating and self-defeating way to do it. This is a bad thing to focus on in the short term. This is a long term goal, and when that promo offer comes up, you want to feel... ready. You don't want to go gold as a trophy to put on your shelf - you want to go gold because silver is no longer the right place for you to be playing, because those pilots as a group no longer offer you fair competition.
I don't know, we can always improve. Maybe there are ways to resolve some of the concerns that the skill rating curiosity derives from without pressing the system into service it wasn't designed for. Maybe we can put your ratings on your profile on a quarterly basis, so you can see your progress? Maybe we can give you some more warning when a demotion might be coming, or limit them to quarterly reviews or something? I'm open to solving problems, for sure.
But when you say people who can see the ratings play for them - you have to mean Jeds. Jeds... is... a unique individual, and he does himself no service by caring so much about his rating, and he'd be the first to tell you that. He can't help it. None of the rest of us play for rating. And none of the rest of you should.
-
Drakona
- Site Admin
- Posts: 1494
- Joined: Fri Aug 30, 2013 5:35 pm
Good explanation, Drakona. Thank you.
Although maybe it's time to look at the algo again, seems like a few months back it accidentally put silver wings on a smouldering compost heap

-
Maestro
- Posts: 122
- Joined: Mon Dec 07, 2015 9:05 am
My apologies for the mixup, Drak. I don't know where I got that idea from either, other than I thought Tom was the math-ier of the two of you.
Thanks for the response. For the record, I don't think it's just Jeds, but he is the most transparent about his experience with it. I don't want to see it; being MCL champ is all the motivation I need.

-
melvin
- Posts: 515
- Joined: Thu Mar 20, 2014 11:23 pm